Abstract. In a recent paper [7] we analyzed a numerical algorithm for computing
the number of real zeros of a polynomial system. The analysis relied on a condition
number $\kappa(f)$ for the input system $f$. In this paper we look at $\kappa(f)$ as a random variable
derived from imposing a probability measure on the space of polynomial systems and
give bounds for both the tail $P\{\kappa(f) > a\}$ and the expected value $E(\log \kappa(f))$.
1 Introduction

1.1 Overview 

This paper is the third of a series which started with [7]. In the first paper of the series
we analyzed a numerical algorithm for computing the number of real zeros of a polynomial
system. This algorithm works with finite precision and the analysis provided bounds for both
its complexity (total number of arithmetic operations) and the machine precision needed to
guarantee that the returned value is correct. Both bounds depended on size parameters for
the input system $f$ (number of polynomials, degrees, etc.) as well as on a condition number
$\kappa(f)$ for $f$. A precise statement of the main result in [7] is Theorem 1.1 therein. To the best
of our knowledge, this theorem is the only result providing a finite-precision analysis of a zero
counting algorithm. Consequently, as of today, understanding zero-counting computations
in the presence of finite precision appears to require an understanding of $\kappa(f)$.

Unlike the aforementioned size parameters, the condition number $\kappa(f)$ cannot be read
directly from the system $f$. Indeed, it is conjectured that the computation of $\kappa(f)$ is at
least as difficult as solving the zero counting problem for $f$, so we need a much deeper
understanding of $\kappa(f)$. In the second paper of the series [8], we attempted to provide
such an understanding from two different angles. Firstly, we showed that a closely related
condition number $\tilde\kappa(f)$ satisfies a Condition Number Theorem, i.e., $\tilde\kappa(f)$ is the normalized
inverse of the distance from $f$ to the set of ill-posed systems (those having multiple zeros).
The relation between the quantities $\kappa(f)$ and $\tilde\kappa(f)$ is close indeed (see [5, Prop. 3.3]):

$$\frac{\kappa(f)}{\sqrt{n}} \le \tilde\kappa(f) \le \sqrt{2n}\,\kappa(f).$$

Secondly, we used this characterization, in conjunction with a result from [5], to provide a
smoothed analysis of $\tilde\kappa(f)$ (and hence, of $\kappa(f)$ as well). A smoothed analysis of the
complexity and accuracy for the algorithm in [7] immediately follows. Details about smoothed
analyses and distance to ill-posedness can be found in the introduction of [8].

As a consequence of the smoothed analysis of $\tilde\kappa(f)$ one immediately obtains an
average-case analysis of this condition number. One is left, however, with the feeling that the
bounds thus obtained are far from optimal. Indeed, these bounds follow from a result which 
is general in two aspects. Firstly, it is a smoothed analysis (of which usual average analysis 
is just a particular case). Secondly, it is derived from a very general result yielding smoothed 
analysis bounds for condition numbers satisfying a Condition Number Theorem and stated 
in terms of some geometric invariants (degree and dimension) of the set of ill-posed inputs. 
The question of whether a finer average analysis can be obtained by using methods more 
ad-hoc for the problem at hand naturally poses itself. 

In this paper we show that such bounds are possible. Loosely speaking, the average
analysis in [8] shows a bound for a typical $\kappa(f)$ (or $\tilde\kappa(f)$) which is of order $\mathcal D^2$, where $\mathcal D$
is the Bézout number of $f$. Here we show that $\mathcal D^{1/2}$ is a more accurate upper bound. This
improvement is meaningful, since $\mathcal D$ increases exponentially with $n$. Our main result implies
that if the maximum degree $D$ remains bounded as $n$ grows, $E(\ln \kappa(f))$ is bounded from
above by a quantity equivalent to $\ln(\mathcal D^{1/2})$, which according to the Shub-Smale Theorem,
see [19], equals the logarithm of the mathematical expectation of the total number of real
roots of the polynomial system. More precisely,
$$\limsup_{n\to\infty} \frac{E(\ln \kappa(f))}{\ln \mathcal D^{1/2}} \le 1.$$



No non-trivial lower bound has been obtained for the time being as far as we know. 
We next proceed to set up the basic notions and notations enabling us to state the above 
in more precise terms. 

1.2 Basic definitions and main result 

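For $d \in \mathbb N$ we denote by $\mathcal H_d$ the subspace of $\mathbb R[x_0, \ldots, x_n]$ of homogeneous polynomials of
degree $d$ and, for $\mathbf d := (d_1, \ldots, d_n)$, we set $\mathcal H_{\mathbf d} := \mathcal H_{d_1} \times \cdots \times \mathcal H_{d_n}$. We endow $\mathcal H_d$ with the
Weyl norm which is defined, for $f \in \mathcal H_d$, $f(x) = \sum_{|j|=d} a_j x^j$, by

$$\|f\|_W^2 = \sum_{|j|=d} \frac{a_j^2}{\binom{d}{j}}$$

where $x = (x_0, \ldots, x_n)$, $j = (j_0, \ldots, j_n)$, $|j| := j_0 + \cdots + j_n$, $x^j = x_0^{j_0} \cdots x_n^{j_n}$ and
$\binom{d}{j} := \frac{d!}{j_0! \cdots j_n!}$.

We then endow $\mathcal H_{\mathbf d}$ with the norm given by

$$\|f\| := \max_{1 \le i \le n} \|f_i\|_W.$$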
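As an illustration of the two norms just defined, the following minimal sketch computes the Weyl norm of a homogeneous polynomial and the induced norm of a system. The dictionary representation (exponent tuple mapped to coefficient) and the function names are assumptions made for this example only.

```python
from math import factorial, sqrt

def multinomial(d, j):
    """Multinomial coefficient d! / (j_0! ... j_n!) for a multi-index j with |j| = d."""
    assert sum(j) == d
    out = factorial(d)
    for jk in j:
        out //= factorial(jk)  # each partial quotient is an integer
    return out

def weyl_norm(coeffs, d):
    """Weyl norm of a degree-d homogeneous polynomial.

    coeffs maps exponent tuples j (hypothetical representation) to coefficients a_j.
    """
    return sqrt(sum(a * a / multinomial(d, j) for j, a in coeffs.items()))

def system_norm(system):
    """Norm of a system: the largest Weyl norm among its components.

    system is a list of (coeffs, degree) pairs.
    """
    return max(weyl_norm(c, d) for c, d in system)

# f(x0, x1) = (x0 + x1)^2 = x0^2 + 2 x0 x1 + x1^2
f = {(2, 0): 1.0, (1, 1): 2.0, (0, 2): 1.0}
print(weyl_norm(f, 2))            # squared norm: 1/1 + 4/2 + 1/1 = 4
print(system_norm([(f, 2)]))
```

The example polynomial is the square of the linear form with coefficient vector $(1,1)$, and its Weyl norm equals $\|(1,1)\|^2 = 2$, which reflects the orthogonal invariance of this norm.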

For $f = (f_1, \ldots, f_n) \in \mathcal H_{\mathbf d}$, as in [7], we define the following condition number

$$\kappa(f) = \max_{x \in S^n} \min\left\{ \mu_{\rm norm}(f, x),\ \frac{\|f\|}{\|f(x)\|_\infty} \right\}$$

with

$$\mu_{\rm norm}(f, x) = \sqrt{n}\, \|f\|\, \|D_x(f)^{-1} M\|.$$

Here 

• $D_x(f) = Df(x)|_{T_x S^n}$ is the derivative of $f$ along the unit sphere $S^n \subset \mathbb R^{n+1}$ at the
point $x$, a linear operator from the tangent space $T_x(S^n)$ to $\mathbb R^n$,

• $M := \mathrm{diag}(\sqrt{d_1}, \ldots, \sqrt{d_n})$ is the scaling $n \times n$ diagonal matrix with diagonal entries
the square roots of the degrees $d_i = \deg(f_i)$,

• the norm $\|D_x(f)^{-1} M\|$ is the spectral norm, i.e., the operator norm
$\max\{\|D_x(f)^{-1} M y\|_2 ;\ y \in S^n,\ y \perp x\}$ with respect to $\|\ \|_2$,

• $\|f(x)\|_\infty = \max_{1 \le i \le n} |f_i(x)|$ denotes as usual the infinity norm.

We next impose the probability measure on $\mathcal H_{\mathbf d}$ defined by Eric Kostlan [15] and
Shub-Smale [19]. This measure assumes the coefficients of the polynomials $f_i = \sum_{|j|=d_i} a_j^{(i)} x^j$ are
independent, Gaussian, centered random variables, with variances

$$\mathrm{Var}(a_j^{(i)}) = \binom{d_i}{j}.$$

For this distribution, and for $x, y \in \mathbb R^{n+1}$, $1 \le i, k \le n$, covariances are given by (see Lemma
2.2 below)

$$E(f_i(x) f_k(y)) = \delta_{ik} \langle x, y \rangle^{d_i},$$

where $\delta_{ik}$ is the Kronecker symbol.
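The covariance formula can be checked directly from the coefficient variances: since the coefficients are independent and centered, $E(f(x)f(y)) = \sum_{|j|=d} \mathrm{Var}(a_j)\, x^j y^j$, and the multinomial theorem turns this sum into $\langle x, y\rangle^d$. The sketch below verifies the identity numerically for one polynomial of the system; the helper names and the test points are hypothetical.

```python
from itertools import product
from math import factorial

def multinomial(d, j):
    """d! / (j_0! ... j_n!)."""
    out = factorial(d)
    for jk in j:
        out //= factorial(jk)
    return out

def multi_indices(d, m):
    """All exponent tuples of length m with total degree d."""
    return [j for j in product(range(d + 1), repeat=m) if sum(j) == d]

def monomial(x, j):
    """Value of x^j = x_0^{j_0} ... x_n^{j_n}."""
    out = 1.0
    for xk, jk in zip(x, j):
        out *= xk ** jk
    return out

def kostlan_covariance(x, y, d):
    """E(f(x) f(y)) = sum_j Var(a_j) x^j y^j with Var(a_j) = multinomial(d, j)."""
    return sum(multinomial(d, j) * monomial(x, j) * monomial(y, j)
               for j in multi_indices(d, len(x)))

x = (0.3, -1.2, 0.5)
y = (1.0, 0.4, -0.7)
d = 4
inner = sum(a * b for a, b in zip(x, y))
print(kostlan_covariance(x, y, d), inner ** d)  # the two values agree up to rounding
```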

This probability law is invariant under the action of the orthogonal group, which makes it
possible to perform the computations below. These appear to be much more complicated under
other distributions not sharing this invariance property.

To state our main results a number of quantities will be useful. We use the notation

$$D := \max_{1 \le i \le n} d_i, \qquad \mathcal D := \prod_{i=1}^n d_i, \qquad N := \dim \mathcal H_{\mathbf d} = \sum_{i=1}^n \binom{n + d_i}{n}.$$

We note that $\mathcal D$ is the Bézout number of the polynomial system. We may assume here that
$d_i \ge 2$ for $1 \le i \le n$ since otherwise we could restrict to a system with fewer equations and
unknowns. Notice that $N \le n^{D+2}$.
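For concreteness, the size parameters can be computed from the list of degrees as follows. This is only a small sketch; the function name and the sample degrees are assumptions of the example.

```python
from math import comb, prod

def size_parameters(degrees):
    """Return (D, Bezout number, N) for a system with the given degrees d_1, ..., d_n."""
    n = len(degrees)
    D = max(degrees)
    bezout = prod(degrees)
    # dim of the space of degree-d homogeneous polynomials in n + 1 variables is C(n + d, n)
    N = sum(comb(n + d, n) for d in degrees)
    return D, bezout, N

degrees = [2, 3, 3]
n = len(degrees)
D, bezout, N = size_parameters(degrees)
print(D, bezout, N)          # 3 18 50
print(N <= n ** (D + 2))     # True for this example
```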

We are now ready to state our main result. 

Theorem 1.1. Let the random system $f$ satisfy the conditions of the Shub-Smale model
and assume $n \ge 3$. Then,

(i) For $a > 4\sqrt{2}\, D^2 n^{7/2} N^{1/2}$ one has

$$P(\kappa(f) > a) \le K_n \frac{\sqrt{2n}\,\big(1 + \ln(a/\sqrt{2n})\big)^{1/2}}{a}$$

where $K_n := 8 D^2 \mathcal D^{1/2} N^{1/2} n^{5/2} + 1$.

(ii)

$$E(\ln \kappa(f)) \le \ln K_n + (\ln K_n)^{1/2} + (\ln K_n)^{-1/2} + \frac{1}{2} \ln(2n).$$

In fact we are going to prove the corresponding result for the alternative quantity $\tilde\kappa(f)$
already considered in [5], since it will enable us to use $L^2$ methods, which are better adapted
to the type of calculations we will perform. We recall that

$$\tilde\kappa(f) = \frac{\|f\|_W}{\Big(\min_{x \in S^n} \big\{ \|D_x(f)^{-1} M\|^{-2} + \|f(x)\|_2^2 \big\}\Big)^{1/2}}$$

where $\|f\|_W^2 := \sum_{1 \le i \le n} \|f_i\|_W^2$ is the Weyl norm of the system and $\|f(x)\|_2^2 :=
\sum_{1 \le i \le n} f_i(x)^2$ denotes the usual Euclidean norm. As we have already mentioned, we have
$\frac{\kappa(f)}{\sqrt{n}} \le \tilde\kappa(f) \le \sqrt{2n}\,\kappa(f)$. Also, as a consequence of [8, Th. 1.1], $\tilde\kappa(f)$ satisfies $\tilde\kappa(f) \ge 1$ for
all $f \in \mathcal H_{\mathbf d}$.

We will therefore obtain Theorem 1.1 as a direct consequence of the following result.

Theorem 1.2. Let the random system $f$ satisfy the conditions of the Shub-Smale model
and assume $n \ge 3$. Then,

(i) For $a > 4 D^2 n^3 N^{1/2}$ one has

$$P(\tilde\kappa(f) > a) \le K_n \frac{(1 + \ln a)^{1/2}}{a}$$

where $K_n := 8 D^2 \mathcal D^{1/2} N^{1/2} n^{5/2} + 1$.

(ii)

$$E(\ln \tilde\kappa(f)) \le \ln K_n + (\ln K_n)^{1/2} + (\ln K_n)^{-1/2}.$$

Theorem 1.1 follows from $P(\kappa(f) > a) \le P(\tilde\kappa(f) > a/\sqrt{2n})$, since $\kappa(f) > a \Rightarrow \tilde\kappa(f) > a/\sqrt{2n}$.
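To get a feeling for how a tail bound of the shape $K(1+\ln a)^{1/2}/a$ controls the expectation of the logarithm, one can use that $\tilde\kappa(f) \ge 1$, so that $E(\ln\tilde\kappa(f)) = \int_0^\infty P(\tilde\kappa(f) > e^t)\,dt$. The following sketch is a numerical sanity check only, not the argument of the paper: it integrates the tail bound (capped at 1) and compares the result with the right-hand side of part (ii). The constant $K$ is passed in as a plain number, and the values used are arbitrary illustrations.

```python
from math import exp, log, sqrt

def tail_bound(a, K):
    """Bound of the form K * (1 + ln a)^(1/2) / a, capped at 1 (a probability)."""
    return min(1.0, K * sqrt(1.0 + log(a)) / a)

def log_expectation_bound(K):
    """Expression ln K + (ln K)^(1/2) + (ln K)^(-1/2)."""
    lk = log(K)
    return lk + sqrt(lk) + 1.0 / sqrt(lk)

def integrated_tail(K, step=1e-3):
    """Trapezoid approximation of the integral over t >= 0 of tail_bound(e^t, K).

    The integral is truncated at T = ln K + 60, where the integrand is negligible.
    """
    T = log(K) + 60.0
    m = int(T / step)

    def g(t):
        return min(1.0, K * sqrt(1.0 + t) * exp(-t))

    inner = sum(g(i * step) for i in range(1, m))
    return step * (0.5 * g(0.0) + inner + 0.5 * g(m * step))

for K in (10.0, 1e3, 1e6):  # illustrative values, not values of K_n
    print(K, integrated_tail(K), log_expectation_bound(K))
```

In these examples the integrated tail stays below the closed-form expression, which is consistent with part (ii) being derived from part (i).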

The proof of Theorem 1.2 is given in Section 2. It requires a certain number of auxiliary
results. With the aim of isolating (and in this way highlighting) the main ideas, we will
postpone the proof of these auxiliary results to Section 3, though stating them as needed in
the text. This will be indicated by the symbol at the end of the statement.

1.3 Relations with previous work 

Probably the most successful combination of algorithmics, conditioning, and probability 
occurs in the study of complex polynomial systems (a setting similar to ours but with the 
coefficients of the polynomials now drawn from C and considering projective complex zeros). 
This study spans an impressive collection of papers, which began with [TBI HH1 HH1 US HI] 
and continued in [3] and [TTJ H] . The final outcome of these efforts is a randomized algorithm 
producing an approximate zero of the input system in expected time which is polynomial 
in the size of the system. The expectation is with respect to both the random choices in the 
algorithm and a probability measure on the input data. 

The condition number of a system $f$ in this setting is defined to be

$$\mu_{\rm norm}(f) := \max_{\zeta \in S^n_{\mathbb C} \,\mid\, f(\zeta) = 0} \mu_{\rm norm}(f, \zeta).$$

Here $\mu_{\rm norm}(f, \zeta)$ is roughly the quantity we defined above. Over the reals, it may not be
well-defined since the zero set of $f$ may be empty. If one restricts attention to the subset
$\mathcal R_{\mathbf d} \subset \mathcal H_{\mathbf d}$ of those systems having at least a real zero one may similarly define a measure
$\mu_{\rm worst}(f)$, maximizing over the set of real zeros. This has been done in [5] where bounds
for the tail and the expected value of $\mu_{\rm worst}(f)$ are given. These bounds are very satisfying
(for instance, the tail $P(\mu_{\rm worst} > a)$ is bounded by an expression in $a^{-2}$, a fact ensuring the
finiteness of $E(\mu_{\rm worst}(f))$). The measure $\mu_{\rm worst}(f)$, however, is hardly a condition number
for the problem of real zero counting, not even restricted to the subset $\mathcal R_{\mathbf d}$. To understand
why, consider a polynomial as in the left-hand side of the figure below.





For this polynomial one has $\mu_{\rm worst} = \infty$.

An upward small perturbation (as in the right-hand side) yields a low value of $\mu_{\rm worst}$.
This value admits a finite limit when such perturbations are small enough! The measure
$\mu_{\rm worst}(f)$ appears to be insensitive to the closeness to ill-posedness. This runs contrary to
the notion of conditioning [121 1131 021 EH] . 

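A condition number $\mu^*(f)$ for the feasibility problem of real systems (which, obviously,
needs to be defined on all of $\mathcal H_{\mathbf d}$) was given in [9] by taking

$$\mu^*(f) = \begin{cases} \displaystyle\min_{\zeta \in S^n \,\mid\, f(\zeta) = 0} \mu_{\rm norm}(f, \zeta) & \text{if } f \in \mathcal R_{\mathbf d} \\[2ex] \displaystyle\max_{x \in S^n} \frac{\|f\|}{\|f(x)\|} & \text{otherwise.} \end{cases}$$

As of today, there is no probabilistic analysis for it.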



2 Proof of Theorem 1.2

The proof relies on the so-called Rice Formula for the expectation of the number of
local minima of a real-valued random field. This is described precisely in Step 2 below.
Previously, in Step 1, we use large deviations to show that for large $n$, except on a set of
small probability, the numerator $\|f\|_W$ in $\tilde\kappa(f)$ is nearly equal to $N^{1/2}$. Steps 3, 4, and 5
estimate the different expressions occurring in the Rice formula. Finally, Step 6 wraps up all
these estimates to yield the upper bound for the density and Step 7 derives from it the
bounds claimed in the statement of Theorem 1.2.

During the rest of the proof, we set $L = L(f) := \min_{x \in S^n} \{ \|D_x(f)^{-1} M\|^{-2} + \|f(x)\|_2^2 \}$ so
that $\tilde\kappa(f) = \|f\|_W / \sqrt{L}$. We observe that

$$\|D_x(f)^{-1} M\|^{-1} = \sigma_{\min}(M^{-1} D_x(f)) = \min\{ \|M^{-1} D_x(f) y\| : y \in S^n,\ y \perp x \},$$

(where $\sigma_{\min}$ denotes the minimum singular value), and therefore

$$L = \min\{ \|M^{-1} D_x(f) y\|^2 + \|f(x)\|_2^2 : x, y \in S^n,\ y \perp x \}$$

is the minimum of the random field $\{\mathcal L(x, y) : (x, y) \in V\}$ where

$$\mathcal L(x, y) := \|M^{-1} D_x(f) y\|^2 + \|f(x)\|_2^2 = \sum_{i=1}^n \frac{1}{d_i} \Big( \sum_{j=0}^n \partial_j f_i(x)\, y_j \Big)^2 + \sum_{i=1}^n f_i(x)^2 \tag{1}$$

and $V := \{(x, y) \in \mathbb R^{n+1} \times \mathbb R^{n+1} : \|x\| = \|y\| = 1,\ \langle x, y \rangle = 0\}$.

Here $y = (y_0, \ldots, y_n)$ and, for $1 \le i \le n$ and $0 \le j \le n$, $\partial_j f_i(x)$ denotes the partial
derivative of $f_i$ with respect to $x_j$ at the point $x$.
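Formula (1) is straightforward to evaluate for a concrete system. The sketch below does so for a small hand-picked example with $n = 2$; the dictionary representation of polynomials, the function names and the sample system are assumptions made for illustration.

```python
def evaluate(poly, x):
    """Evaluate a polynomial given as {exponent tuple: coefficient} at the point x."""
    total = 0.0
    for j, a in poly.items():
        term = a
        for xk, jk in zip(x, j):
            term *= xk ** jk
        total += term
    return total

def partial(poly, k):
    """Partial derivative with respect to x_k, in the same dictionary representation."""
    out = {}
    for j, a in poly.items():
        if j[k] > 0:
            e = list(j)
            e[k] -= 1
            key = tuple(e)
            out[key] = out.get(key, 0.0) + a * j[k]
    return out

def field_L(system, x, y):
    """Value of sum_i (1/d_i) <grad f_i(x), y>^2 + sum_i f_i(x)^2.

    system is a list of (poly, degree) pairs.
    """
    total = 0.0
    for poly, d in system:
        directional = sum(evaluate(partial(poly, k), x) * y[k] for k in range(len(x)))
        total += directional ** 2 / d + evaluate(poly, x) ** 2
    return total

f1 = {(1, 1, 0): 1.0, (0, 0, 2): 1.0}    # x0*x1 + x2^2
f2 = {(2, 0, 0): 1.0, (0, 1, 1): -1.0}   # x0^2 - x1*x2
system = [(f1, 2), (f2, 2)]
e0, e1 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
print(field_L(system, e0, e1))  # (1/2)*1^2 + 0 + 0^2 + 1^2 = 1.5
```

At the point $(e_0, e_1)$ only the derivatives in the direction $e_1$ and the values at $e_0$ contribute, which is the simplification exploited later in the proof.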

Step 1. Our first step consists in replacing the Weyl norm in the numerator of $\tilde\kappa(f)$ by a
non-random constant, at the cost of adding a small probability, which will be controlled
using large deviations.

Let $a > 1$. We have

$$P(\tilde\kappa(f) > a) = P\Big( \frac{L}{\|f\|_W^2} < \frac{1}{a^2} \Big) \le P\Big( L < \frac{1}{a^2}(1 + \ln a) N \Big) + P\big( \|f\|_W^2 \ge (1 + \ln a) N \big).$$

We bound the second term in the right-hand side above using the following result that will
be proved in Section 3.



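Lemma 2.1. Set

$$N := \dim \mathcal H_{\mathbf d} = \sum_{i=1}^n \binom{n + d_i}{n}.$$

Then, for $\eta > 0$,

$$P\big( \|f\|_W^2 \ge (1 + \eta) N \big) \le e^{-\frac{N}{2}(\eta - \ln(\eta + 1))}.$$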



Therefore, setting $\eta = \ln a$, we obtain

$$P(\tilde\kappa(f) > a) \le P\Big( L < \frac{1}{a^2}(1 + \ln a) N \Big) + \exp\Big( -\frac{N}{2}\big( \ln a - \ln(\ln a + 1) \big) \Big). \tag{2}$$
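Under the Shub-Smale model the rescaled coefficients $a_j / \binom{d}{j}^{1/2}$ are independent standard normal variables, so $\|f\|_W^2$ is a sum of $N$ squared standard normals, i.e. it has the $\chi^2_N$ distribution. The sketch below compares the large deviation bound of Lemma 2.1 with the exact $\chi^2_N$ tail, which has a closed form when $N$ is even. The function names and the sample values of $N$ and $\eta$ are assumptions of the example.

```python
from math import exp, log, factorial

def chernoff_bound(N, eta):
    """Right-hand side of Lemma 2.1: exp(-(N/2) * (eta - ln(1 + eta)))."""
    return exp(-0.5 * N * (eta - log(1.0 + eta)))

def chi2_upper_tail_even(N, x):
    """Exact P(chi^2_N >= x) for even N: exp(-x/2) * sum_{k < N/2} (x/2)^k / k!."""
    assert N % 2 == 0
    half = 0.5 * x
    return exp(-half) * sum(half ** k / factorial(k) for k in range(N // 2))

N = 20
for eta in (0.5, 1.0, 2.0):
    exact = chi2_upper_tail_even(N, (1.0 + eta) * N)
    print(eta, exact, chernoff_bound(N, eta))  # exact tail is below the bound
```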

The second term in the right-hand side above can be easily estimated. We therefore turn
our attention to the first. Given $\alpha > 0$, we want to compute an upper bound for

$$P(L < \alpha).$$



Step 2. Our second step consists in giving a bound for the density function $p_L(u)$ of the
random variable $L$, i.e. such that

$$P(L < \alpha) = \int_0^\alpha p_L(u)\, du$$

since $L$ is non-negative. We recall that the quantity $L$ is the minimum of the random field
$\{\mathcal L(x, y) : (x, y) \in V\}$, for $\mathcal L$ and $V$ defined in Formula (1).



Notice that $V$ is the Stiefel manifold $S(2, n+1)$, a compact, orientable, $C^\infty$-differentiable
manifold of dimension $2n - 1$, embedded in $\mathbb R^{n+1} \times \mathbb R^{n+1}$. For each linear orthogonal
transformation $U$ of $\mathbb R^{n+1}$, define $\tilde U : V \to V$, $(x, y) \mapsto (Ux, Uy)$, and denote by $\mathcal U$ the set
of these $\tilde U$ provided with the group structure naturally inherited from the orthogonal group
in $\mathbb R^{n+1}$. Then $\mathcal U$ acts transitively on $V$.

At a generic point $(x, y)$ of the manifold $V$, the normal space $N_{(x,y)}(V)$ has dimension
$(2n + 2) - (2n - 1) = 3$, and is generated by the orthonormal set $\{(x, 0), (0, y), \frac{1}{\sqrt 2}(y, x)\}$.
Therefore, if $\{z_2, \ldots, z_n\} \subset \mathbb R^{n+1}$ is such that $\{x, y, z_2, \ldots, z_n\}$ is an orthonormal basis of
$\mathbb R^{n+1}$, the set

$$\mathcal B_{T_{(x,y)}} := \Big\{ (z_2, 0), \ldots, (z_n, 0), (0, z_2), \ldots, (0, z_n), \tfrac{1}{\sqrt 2}(y, -x) \Big\} \tag{3}$$

is an orthonormal basis of the tangent space $T_{(x,y)}(V)$.

We denote by $\sigma_V(d(x, y))$ the geometric measure on $V$ (i.e. the measure induced by the
Riemannian distance on $V$), which is invariant under the action of the group $\mathcal U$. The total
measure satisfies

$$\sigma_V(V) = \sqrt{2}\, \sigma_{n-1} \sigma_n, \tag{4}$$

where $\sigma_k = 2\pi^{(k+1)/2} / \Gamma((k+1)/2)$ is the total $k$-dimensional measure of the unit sphere
$S^k$, see for example [2, Lemma 13.5].
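The constants in Formula (4) are easy to evaluate numerically. The short sketch below computes $\sigma_k$ and $\sigma_V(V)$; the function names are hypothetical.

```python
from math import pi, gamma, sqrt

def sphere_measure(k):
    """Total k-dimensional measure of the unit sphere S^k in R^(k+1)."""
    return 2.0 * pi ** ((k + 1) / 2) / gamma((k + 1) / 2)

def stiefel_measure(n):
    """sqrt(2) * sigma_{n-1} * sigma_n, as in Formula (4)."""
    return sqrt(2.0) * sphere_measure(n - 1) * sphere_measure(n)

print(sphere_measure(1), 2 * pi)   # length of the unit circle
print(sphere_measure(2), 4 * pi)   # area of the unit sphere in R^3
print(stiefel_measure(3))
```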

For $\alpha > 0$ and $S$ a Borel subset of $V$, we denote by $m_\alpha(\mathcal L, S)$ the number of local minima
of the random function $\mathcal L$ on the set $S$, having value smaller than $\alpha$. Clearly:

$$P(L < \alpha) = P(m_\alpha(\mathcal L, V) \ge 1) \le E(m_\alpha(\mathcal L, V)). \tag{5}$$

Our aim is to give a useful expression for the right-hand side of Formula (5). For that
purpose, let us set for each Borel subset $S$ of $V$, $\nu(S) := E(m_\alpha(\mathcal L, S))$. Clearly, $\nu$ is a
measure. The invariance of the law of the random field $\{\mathcal L(x, y) : (x, y) \in V\}$ under the
action of $\mathcal U$ implies that $\nu$ is also invariant under $\mathcal U$.

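Let $\psi : B_{2n-1,\delta} \to \mathbb R^{n+1} \times \mathbb R^{n+1}$ be a chart on $V$, that is, a smooth diffeomorphism
between the ball in $\mathbb R^{2n-1}$ centered at the origin with radius $\delta > 0$ and its image
$W = \psi(B_{2n-1,\delta}) \subset V$.

We denote by $\bar{\mathcal L} : B_{2n-1,\delta} \to \mathbb R$ the composition $\bar{\mathcal L}(w) = \mathcal L(\psi(w))$.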

As we already mentioned, our main tool is Rice formula, of which we now present a quick 
overview: 

Let $U$ be an open subset of $\mathbb R^m$ and $Z : U \to \mathbb R^m$ a random function having sufficiently
smooth paths. Let us denote by $\nu_Z(S)$ the number of zeros of $Z$ belonging to the Borel
subset $S$ of $U$. Under certain general conditions on the probability law of $Z$, one can
compute the expectation of $\nu_Z(S)$ by means of an integral on the set $S$. The integrand is a
certain function depending on the underlying probability law.

The simplest form of such a formula is the following:

$$E(\nu_Z(S)) = \int_S E\big( |\det(Z'(t))| \,/\, Z(t) = 0 \big)\, p_{Z(t)}(0)\, dt. \tag{6}$$

One must be careful in the choice of the version of the conditional expectation and the
density $p_{Z(t)}(\cdot)$ of the random vector $Z(t)$, since they are only defined almost everywhere.
But this can be done in a certain number of cases in a canonical form, in such a way that
the formula holds true.
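A one-dimensional toy case may help to see Formula (6) at work. It is not taken from the paper: take $Z(t) = a\cos t + b\sin t$ on $[0, 2\pi)$ with $a, b$ independent standard normal. Then $Z(t)$ and $Z'(t)$ are independent standard normal variables for every $t$, so the integrand equals $E|Z'(t)| \cdot p_{Z(t)}(0) = \sqrt{2/\pi} \cdot 1/\sqrt{2\pi} = 1/\pi$, and the formula predicts an expected number of zeros equal to 2. The sketch below compares this with a direct count of sign changes; all names and parameters are assumptions of the example.

```python
import random
from math import cos, sin, pi, sqrt

def rice_integrand(t):
    """E(|Z'(t)| / Z(t) = 0) * p_{Z(t)}(0) for Z(t) = a cos t + b sin t, a, b iid N(0, 1).

    Z(t) and Z'(t) are independent N(0, 1), so the conditioning can be dropped.
    """
    expected_abs_derivative = sqrt(2.0 / pi)   # E|N(0, 1)|
    density_at_zero = 1.0 / sqrt(2.0 * pi)     # N(0, 1) density at 0
    return expected_abs_derivative * density_at_zero

def rice_expected_zeros(steps=1000):
    """Midpoint-rule integral of the Rice integrand over [0, 2 pi)."""
    h = 2.0 * pi / steps
    return sum(rice_integrand((i + 0.5) * h) for i in range(steps)) * h

def count_zeros(a, b, grid=2000):
    """Count sign changes of Z on a periodic grid over [0, 2 pi)."""
    vals = [a * cos(2 * pi * i / grid) + b * sin(2 * pi * i / grid) for i in range(grid)]
    return sum(1 for i in range(grid) if vals[i] * vals[(i + 1) % grid] < 0)

rng = random.Random(0)
samples = [count_zeros(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
print(rice_expected_zeros(), sum(samples) / len(samples))  # both are close to 2
```

In this example every sample path has exactly two zeros, so the empirical mean and the Rice integral coincide. For the random field of the proof the integrand is far less explicit, which is why Steps 3 to 5 are needed.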

This kind of formula can be extended to a variety of situations, such as: a) the zeros of $Z$
can be "marked", which means that instead of all zeros, we count only those zeros satisfying
certain additional conditions; b) the domain can be a manifold instead of an open subset of
Euclidean space; c) one has formulas similar to (6) for the higher moments of $\nu_Z(S)$; d) the
dimension of the domain can be larger than the one of the image, in which case the natural
problem, instead of counting roots, is studying the geometry of the random set $Z^{-1}(\{0\})$.
For a detailed account of this subject, including proofs and applications, see [2, Chapters 3
and 6].

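Here we want to express by means of a Rice formula the expectation

$$\nu(S) = E(m_\alpha(\mathcal L, S)) = E(m_\alpha(\bar{\mathcal L}, \psi^{-1}(S))).$$

In our case, with probability 1, $m_\alpha(\bar{\mathcal L}, \psi^{-1}(S))$ equals the number of points $w \in \psi^{-1}(S)$
such that the derivative $\bar{\mathcal L}'(w)$ vanishes, the second derivative $\bar{\mathcal L}''(w)$ is positive definite
and the value $\bar{\mathcal L}(w)$ is bounded by $\alpha$. Then, under certain conditions, we can write (use [2,
Formula (6.19)], mutatis mutandis):

$$\nu(S) = E(m_\alpha(\mathcal L, S)) = E(m_\alpha(\bar{\mathcal L}, \psi^{-1}(S)))
= \int_0^\alpha du \int_{\psi^{-1}(S)} E\big( |\det(\bar{\mathcal L}''(w))|\, \chi_{\{\bar{\mathcal L}''(w) \succ 0\}} \,/\, \bar{\mathcal L}(w) = u,\ \bar{\mathcal L}'(w) = 0 \big)\, p_{\bar{\mathcal L}(w), \bar{\mathcal L}'(w)}(u, 0)\, dw. \tag{7}$$

Here $\chi_A$ means indicator function of the set $A$, $\succ$ means positive definite, $p_{\bar{\mathcal L}(w), \bar{\mathcal L}'(w)}$ is the
joint density in $\mathbb R^1 \times \mathbb R^{2n-1}$ of the pair of random variables $(\bar{\mathcal L}(w), \bar{\mathcal L}'(w))$, and $dw$ is Lebesgue
measure on $\mathbb R^{2n-1}$. Note that in the chart image, $d\sigma_V = \big(\det((\psi'(w))^t \psi'(w))\big)^{1/2} dw$.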



In [2, Proposition 6.6] it is proved that if the integrand in Formula (7) were well-defined
then the change of variable formula would be satisfied, so that $\nu(S)$ would be the integral
of a $(2n - 1)$-form. In that case, Formula (7) would already imply that the measure $\nu$ is
finite and absolutely continuous with respect to $\sigma_V$, so that one could write for each Borel
subset $S$ of $V$

$$\nu(S) = \int_S g\, d\sigma_V$$

for a continuous function $g$. Let us prove that in that case the Radon-Nikodym derivative
$g$ would be constant. To see this, notice that $\sigma_V$ is also invariant under $\mathcal U$ and the action of
this group is transitive on $V$. If $g$ takes different values at two points $(x_1, y_1)$ and $(x_2, y_2)$
of $V$, letting $\tilde U \in \mathcal U$ be such that $\tilde U(x_1, y_1) = (x_2, y_2)$, we can find a small neighborhood $S$
of $(x_1, y_1)$ such that

$$\int_S g\, d\sigma_V \ne \int_{\tilde U(S)} g\, d\sigma_V,$$

contradicting the invariance of $\nu$.

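We could then compute the constant $g$ by computing it at the point $(e_0, e_1)$. We choose the
chart $\psi$ in such a way that $\psi(0) = (e_0, e_1)$ and $(\psi'(0))^t \psi'(0) = I_{2n-1}$ and compute

$$g = \lim_{\varepsilon \to 0} \frac{\nu(\psi(B_{2n-1,\varepsilon}))}{\sigma_V(\psi(B_{2n-1,\varepsilon}))}
= \int_0^\alpha E\big( |\det(\bar{\mathcal L}''(0))|\, \chi_{\{\bar{\mathcal L}''(0) \succ 0\}} \,/\, \bar{\mathcal L}(0) = u,\ \bar{\mathcal L}'(0) = 0 \big)\, p_{\bar{\mathcal L}(0), \bar{\mathcal L}'(0)}(u, 0)\, du.$$

So, if Formula (7) were true, it follows that we could write

$$\nu(S) = \sigma_V(S) \int_0^\alpha E\big( |\det(\bar{\mathcal L}''(0))|\, \chi_{\{\bar{\mathcal L}''(0) \succ 0\}} \,/\, \bar{\mathcal L}(0) = u,\ \bar{\mathcal L}'(0) = 0 \big)\, p_{\bar{\mathcal L}(0), \bar{\mathcal L}'(0)}(u, 0)\, du. \tag{8}$$

However, if one computes the ingredients in the integrand of the right-hand side of
Formula (7), it turns out that the value of the density is $+\infty$ and the conditional expectation
vanishes. So, the formula is meaningless in this form.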

To overcome this difficulty we proceed as follows: 

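Let $S_{(x,y)} = \mathrm{span}(z_2, \ldots, z_n) \subset \mathbb R^{n+1}$ be the orthogonal complement of $\mathrm{span}(x, y) \subset \mathbb R^{n+1}$
and $\pi_{x,y} : \mathbb R^{n+1} \to S_{(x,y)}$ be the orthogonal projection. For $(x, y) \in V$, we introduce a new
random vector $\zeta_{(x,y)}$ defined as

$$\zeta_{(x,y)} := \big( (\pi_{x,y}(f_i'(x)),\ \partial_{yy} f_i(x)) : 1 \le i \le n \big) \in (S_{(x,y)} \times \mathbb R)^n \cong \mathbb R^{n^2}, \tag{9}$$

where for $1 \le i \le n$, $f_i'(x)$ is the free derivative (the gradient) of $f_i$ at $x$, the first $(n - 1)$
coordinates are given by the coordinates of the projection of $f_i'(x)$ onto $S_{(x,y)}$ in the
orthonormal basis $\{z_2, \ldots, z_n\}$ and the $n$-th one is the second derivative in the direction $y$ at $x$.

Then, instead of Formula (7) we write the formula

$$E(m_\alpha(\mathcal L, S)) = \int_0^\alpha du \int_{\psi^{-1}(S)} dw \int_{(S_{\psi(w)} \times \mathbb R)^n} E\big( |\det(\bar{\mathcal L}''(w))| \cdot \chi_{\{\bar{\mathcal L}''(w) \succ 0\}} \,/\, \bar{\mathcal L}(w) = u,\ \bar{\mathcal L}'(w) = 0,\ \zeta_{\psi(w)} = z \big) \cdot p_{\bar{\mathcal L}(w), \bar{\mathcal L}'(w), \zeta_{\psi(w)}}(u, 0, z)\, dz. \tag{10}$$

Formally, Formula (7) is obtained from Formula (10) by integrating in $z$.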

To prove the validity of Formula (10) one could follow exactly the proof of [2, Formula 6.18]
if the random field $\{\mathcal L(x, y) : (x, y) \in V\}$ were Gaussian. This is not our
case. However, it is in fact a simple function of a Gaussian field, namely it is a quadratic
form in the coordinates of $f$ and its first derivatives as shown in Formula (1). It is
then easy to show that Formula (10) remains true as it is done for the general Rice
formulas in [2, Ch. 6, Section 1.4]. This requires proving: (a) the existence and regularity of
the density $p_{\bar{\mathcal L}(w), \bar{\mathcal L}'(w), \zeta_{\psi(w)}}(u, 0, z)$ and (b) with probability 1, $0$ is a regular value of $\bar{\mathcal L}'(w)$.

(a) is contained below in the present proof (see Step 4). As for (b), once the regularity of
this density is established, it follows in the same way as [2, Proposition 6.5 (a)].

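So, using exactly the same arguments leading to Formula (8) we get:

$$E(m_\alpha(\mathcal L, V)) = \sigma_V(V) \int_0^\alpha du \int_{(S_{\psi(0)} \times \mathbb R)^n} E\big( |\det(\bar{\mathcal L}''(0))| \cdot \chi_{\{\bar{\mathcal L}''(0) \succ 0\}} \,/\, \bar{\mathcal L}(0) = u,\ \bar{\mathcal L}'(0) = 0,\ \zeta_{\psi(0)} = z \big) \cdot p_{\bar{\mathcal L}(0), \bar{\mathcal L}'(0), \zeta_{\psi(0)}}(u, 0, z)\, dz.$$

Finally, taking into account Inequality (5) we can conclude that:

$$p_L(u) \le \sigma_V(V) \int_{(S_{\psi(0)} \times \mathbb R)^n} E\big( |\det(\bar{\mathcal L}''(0))| \cdot \chi_{\{\bar{\mathcal L}''(0) \succ 0\}} \,/\, \bar{\mathcal L}(0) = u,\ \bar{\mathcal L}'(0) = 0,\ \zeta_{\psi(0)} = z \big) \cdot p_{\bar{\mathcal L}(0), \bar{\mathcal L}'(0), \zeta_{\psi(0)}}(u, 0, z)\, dz. \tag{11}$$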

Step 3. For the rest of the proof we fix the following orthonormal basis $\mathcal B_T$ (given in (3))
of the tangent space $T := T_{(e_0, e_1)}$:

$$\mathcal B_T = \Big( (e_2, 0), \ldots, (e_n, 0), (0, e_2), \ldots, (0, e_n), \tfrac{1}{\sqrt 2}(e_1, -e_0) \Big). \tag{12}$$

Let us recall that in the right-hand side of Inequality (11) the values of $\bar{\mathcal L}(0)$, $\bar{\mathcal L}'(0)$, $\bar{\mathcal L}''(0)$
are computed using a chart $\psi$ of a neighborhood of $(e_0, e_1)$ such that $\psi(0) = (e_0, e_1)$ and
the image by $\psi'$ of the canonical basis of $\mathbb R^{2n-1}$ is an orthonormal basis of the tangent
space $T$, that we set to be $\mathcal B_T$.

We introduce, for $(x, y) \in V$, the gradient $\nabla \mathcal L(x, y)$ which is the orthogonal projection of
the free derivative $\mathcal L'(x, y)$ onto the tangent space $T_{(x,y)}$ and is obviously independent of the
parametrizations of the manifold $V$. One can check by means of a direct computation that

$$\nabla \mathcal L(e_0, e_1) = \bar{\mathcal L}'(0) (\psi'(0))^t.$$

Then, using the change of variables formula for densities and the fact that $(\psi'(0))^t \psi'(0) =
I_{2n-1}$, we have:

$$p_{\bar{\mathcal L}(0), \bar{\mathcal L}'(0), \zeta_{\psi(0)}}(u, 0, z) = p_{\mathcal L(e_0, e_1), \nabla \mathcal L(e_0, e_1), \zeta_{(e_0, e_1)}}(u, 0, z).$$

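Notation. To simplify notation, from now on we write $f_i$ (resp. $\partial_k f_i$ and $\partial_{k\ell} f_i$,
$0 \le k, \ell \le n$) for $f_i(e_0)$ (resp. $\partial_k f_i(e_0) = \frac{\partial f_i}{\partial x_k}(e_0)$, $\partial_{k\ell} f_i(e_0) = \frac{\partial^2 f_i}{\partial x_k \partial x_\ell}(e_0)$, $0 \le k, \ell \le n$). In
the same spirit we write $\mathcal L$ for $\mathcal L(e_0, e_1) = \bar{\mathcal L}(0)$, $\nabla \mathcal L$ for $\nabla \mathcal L(e_0, e_1)$ and $\mathcal L''$ for $\mathcal L''(e_0, e_1)$.
Finally we write $\zeta$ for $\zeta_{(e_0, e_1)}$ and $S$ for $S_{(e_0, e_1)}$.

Under this notation, Inequality (11) becomes:

$$p_L(u) \le \sigma_V(V) \int_{(S \times \mathbb R)^n} E\big( |\det(\mathcal L'')| \cdot \chi_{\{\mathcal L'' \succ 0\}} \,/\, \mathcal L = u,\ \nabla \mathcal L = 0,\ \zeta = z \big)\, p_{\mathcal L, \nabla \mathcal L, \zeta}(u, 0, z)\, dz. \tag{13}$$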
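According to the definition of $\mathcal L(x, y)$ in (1) we have

$$\mathcal L = \sum_{i=1}^n \frac{1}{d_i} (\partial_1 f_i)^2 + \sum_{i=1}^n f_i^2 \tag{14}$$

and, from Definition (9),

$$\zeta := \zeta_{(e_0, e_1)} = \big( (\partial_2 f_i, \ldots, \partial_n f_i, \partial_{11} f_i),\ 1 \le i \le n \big) \in \mathbb R^{n^2}. \tag{15}$$

We also set $[\nabla \mathcal L]_{\mathcal B_T} := (\xi_2, \ldots, \xi_n, \eta_2, \ldots, \eta_n, \rho)$ for the coordinates of the gradient $\nabla \mathcal L$ in
the basis $\mathcal B_T$.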

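Using that the (free) partial derivatives of $\mathcal L$ at $(e_0, e_1)$ are given by

$$\frac{\partial \mathcal L}{\partial x_k}(e_0, e_1) = \sum_{i=1}^n \frac{2}{d_i} (\partial_{k1} f_i)(\partial_1 f_i) + \sum_{i=1}^n 2 f_i (\partial_k f_i) \quad \text{for } 0 \le k \le n,$$

$$\frac{\partial \mathcal L}{\partial y_\ell}(e_0, e_1) = \sum_{i=1}^n \frac{2}{d_i} (\partial_1 f_i)(\partial_\ell f_i) \quad \text{for } 0 \le \ell \le n,$$

we obtain

$$\xi_j = \langle \mathcal L'(e_0, e_1), (e_j, 0) \rangle = 2 \sum_{i=1}^n \frac{1}{d_i} (\partial_{1j} f_i)(\partial_1 f_i) + 2 \sum_{i=1}^n f_i (\partial_j f_i), \quad 2 \le j \le n,$$

$$\eta_j = \langle \mathcal L'(e_0, e_1), (0, e_j) \rangle = 2 \sum_{i=1}^n \frac{1}{d_i} (\partial_1 f_i)(\partial_j f_i), \quad 2 \le j \le n,$$

$$\rho = \langle \mathcal L'(e_0, e_1), 2^{-1/2}(e_1, -e_0) \rangle
= \sqrt 2 \Big[ \sum_{i=1}^n \frac{1}{d_i} (\partial_1 f_i)(\partial_{11} f_i) + \sum_{i=1}^n f_i (\partial_1 f_i) \Big] - \sqrt 2 \sum_{i=1}^n \frac{1}{d_i} (\partial_0 f_i)(\partial_1 f_i)
= \sqrt 2 \sum_{i=1}^n \frac{1}{d_i} (\partial_1 f_i)(\partial_{11} f_i). \tag{16}$$

Here, $\langle\ ,\ \rangle$ denotes the usual inner product in $\mathbb R^{n+1} \times \mathbb R^{n+1}$ and the last equality in (16)
follows from the equalities $\partial_0 f_i = d_i f_i$ for $1 \le i \le n$ which are easily verified.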

Step 4. In this step we focus on the term $p_{\mathcal L, \nabla \mathcal L, \zeta}(u, 0, z)$ of (13). To this aim we factor
this density as

$$p_{\mathcal L, \nabla \mathcal L, \zeta}(u, 0, z) = q_{\mathcal L, \nabla \mathcal L \,/\, \zeta = z}(u, 0) \cdot p_\zeta(z) \tag{17}$$

where $q_{\mathcal L, \nabla \mathcal L \,/\, \zeta = z}(u, 0)$ denotes conditional density.

To study the two terms in the right-hand side of (17), we need a lemma containing the
ingredients to compute the distributions and conditional expectations appearing in our
proof.

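Lemma 2.2. Let $f \in \mathbb R[X_0, \ldots, X_n]$ be a homogeneous random polynomial of degree $d$.
Assume that $f$ follows the Shub-Smale model for the probability law of the coefficients, i.e.
the coefficients of the polynomial $f = \sum_{|j|=d} a_j X^j$ are independent, Gaussian, centered
random variables with variances

$$\mathrm{Var}(a_j) = \binom{d}{j}.$$

Then

• For $x, y \in \mathbb R^{n+1}$, the covariances satisfy

$$E(f(x) f(y)) = \langle x, y \rangle^d \quad \forall x, y \in \mathbb R^{n+1},$$

where $\langle\ ,\ \rangle$ is the usual inner product in $\mathbb R^{n+1}$.

Moreover, if $e_0 := (1, 0, \ldots, 0)$ is the first vector of the canonical basis of $\mathbb R^{n+1}$ and we write
$f$ (resp. $\partial_k f$ and $\partial_{k\ell} f$, $0 \le k, \ell \le n$) for $f(e_0)$ (resp. $\partial_k f(e_0) = \frac{\partial f}{\partial x_k}(e_0)$, $\partial_{k\ell} f(e_0) =
\frac{\partial^2 f}{\partial x_k \partial x_\ell}(e_0)$, $0 \le k, \ell \le n$), we get the following covariances:

$$E(f\, \partial_k f) = \delta_{k0}\, d \quad \text{for } 0 \le k \le n.$$

$$E((\partial_k f)(\partial_{k'} f)) = \delta_{kk'} [d + \delta_{k0}\, d(d-1)] \quad \text{for } 0 \le k, k' \le n.$$

$$E(f (\partial_{k\ell} f)) = \delta_{k\ell}\, \delta_{k0}\, d(d-1) \quad \text{for } 0 \le k, \ell \le n.$$

$$E((\partial_{k\ell} f)(\partial_{k'} f)) = d(d-1) [(d-2)\, \delta_{\ell 0} \delta_{k0} \delta_{k'0} + \delta_{k0} \delta_{k'\ell} + \delta_{\ell 0} \delta_{kk'}] \quad \text{for } 0 \le k, k', \ell \le n.$$

$$E((\partial_{k\ell} f)(\partial_{k'\ell'} f)) = d(d-1) \big\{ (d-2)(d-3)\, \delta_{k0} \delta_{\ell 0} \delta_{k'0} \delta_{\ell'0} + (d-2) [\delta_{k0} \delta_{k'0} \delta_{\ell\ell'} +
\delta_{k'0} \delta_{\ell 0} \delta_{k\ell'} + \delta_{k0} \delta_{\ell'0} \delta_{k'\ell} + \delta_{\ell 0} \delta_{\ell'0} \delta_{kk'}] + \delta_{kk'} \delta_{\ell\ell'} + \delta_{k\ell'} \delta_{k'\ell} \big\} \quad \text{for } 0 \le k, k', \ell, \ell' \le n.$$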
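The first-order covariances of the lemma can be cross-checked from the coefficient variances alone, since $E((\partial_k f)(\partial_{k'} f)) = \sum_{|j|=d} \mathrm{Var}(a_j)\, (\partial_k x^j)(e_0)\, (\partial_{k'} x^j)(e_0)$. The sketch below performs this check for small $d$ and $n$; it is an illustration with hypothetical helper names, not a substitute for the proof in Section 3.

```python
from itertools import product
from math import factorial

def multinomial(d, j):
    """d! / (j_0! ... j_n!)."""
    out = factorial(d)
    for jk in j:
        out //= factorial(jk)
    return out

def d_monomial(j, k, x):
    """Value at x of the partial derivative in x_k of the monomial x^j."""
    if j[k] == 0:
        return 0.0
    out = float(j[k])
    for i, (xi, ji) in enumerate(zip(x, j)):
        out *= xi ** (ji - 1 if i == k else ji)
    return out

def cov_first_derivatives(d, n, k, kp):
    """E((d_k f)(d_k' f)) at e_0, computed from Var(a_j) = multinomial(d, j)."""
    e0 = (1.0,) + (0.0,) * n
    indices = [j for j in product(range(d + 1), repeat=n + 1) if sum(j) == d]
    return sum(multinomial(d, j) * d_monomial(j, k, e0) * d_monomial(j, kp, e0)
               for j in indices)

def lemma_value(d, k, kp):
    """delta_{kk'} [d + delta_{k0} d (d - 1)], the value stated in the lemma."""
    if k != kp:
        return 0
    return d + (d * (d - 1) if k == 0 else 0)

d, n = 3, 2
for k in range(n + 1):
    for kp in range(n + 1):
        print(k, kp, cov_first_derivatives(d, n, k, kp), lemma_value(d, k, kp))
```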

We proceed with the study of the two terms in the right-hand side of (17).

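Computation of $p_\zeta(z)$. By Lemma 2.2 the $n^2$ coordinates of $\zeta$ in (15) are independent
Gaussian centered random variables satisfying that $\mathrm{Var}(\partial_k f_i) = d_i$ and $\mathrm{Var}(\partial_{11} f_i) = 2 d_i (d_i -
1)$ for $1 \le i \le n$ and $2 \le k \le n$.

Although we are not going to use the exact expression in the sequel, we can immediately
deduce for $z = ((z_{i2}, \ldots, z_{in}, z_{i11}),\ 1 \le i \le n)$ that

$$p_\zeta(z) = \frac{1}{(2\pi)^{n^2/2} \prod_{i=1}^n d_i^{(n-1)/2} \prod_{i=1}^n (2 d_i (d_i - 1))^{1/2}} \exp\Big( -\frac{1}{2} \sum_{i=1}^n \Big( \sum_{k=2}^n \frac{z_{ik}^2}{d_i} + \frac{z_{i11}^2}{2 d_i (d_i - 1)} \Big) \Big).$$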

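Computation of $q_{\mathcal L, \nabla \mathcal L \,/\, \zeta = z}(u, 0)$: We factor it as follows:

$$q_{\mathcal L, \nabla \mathcal L \,/\, \zeta = z}(u, 0) = q_{\mathcal L \,/\, \nabla \mathcal L = 0,\ \zeta = z}(u) \cdot q_{\nabla \mathcal L \,/\, \zeta = z}(0).$$

Remembering that $[\nabla \mathcal L]_{\mathcal B_T} := (\xi_2, \ldots, \xi_n, \eta_2, \ldots, \eta_n, \rho)$, we can write $q_{\nabla \mathcal L \,/\, \zeta = z}(0)$ as

$$q_{\nabla \mathcal L \,/\, \zeta = z}(0) = q_{(\xi_2, \ldots, \xi_n) \,/\, (\eta_2, \ldots, \eta_n, \rho) = 0,\ \zeta = z}(0) \cdot q_{(\eta_2, \ldots, \eta_n, \rho) \,/\, \zeta = z}(0).$$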

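First we compute $q_{(\eta_2, \ldots, \eta_n, \rho) \,/\, \zeta = z}(0)$. The condition $\zeta = z$ says that for $1 \le i \le n$ and
$2 \le j \le n$, $\partial_j f_i = z_{ij}$ and $\partial_{11} f_i = z_{i11}$. Therefore, from Identities (16), we have

$$\begin{pmatrix} \eta_2 \\ \vdots \\ \eta_n \\ \rho \end{pmatrix} = A(z) \begin{pmatrix} \partial_1 f_1 / \sqrt{d_1} \\ \vdots \\ \partial_1 f_n / \sqrt{d_n} \end{pmatrix}
\quad \text{where} \quad
A(z) := \begin{pmatrix} \frac{2 z_{12}}{\sqrt{d_1}} & \cdots & \frac{2 z_{n2}}{\sqrt{d_n}} \\ \vdots & & \vdots \\ \frac{2 z_{1n}}{\sqrt{d_1}} & \cdots & \frac{2 z_{nn}}{\sqrt{d_n}} \\ \frac{\sqrt 2\, z_{111}}{\sqrt{d_1}} & \cdots & \frac{\sqrt 2\, z_{n11}}{\sqrt{d_n}} \end{pmatrix} \tag{18}$$

(the first $n - 1$ rows of $A(z)$ correspond to $\eta_2, \ldots, \eta_n$ and the last one to $\rho$)
is non-singular for almost every $z \in \mathbb R^{n^2}$. Applying again Lemma 2.2, $\partial_1 f_i / \sqrt{d_i}$, $1 \le i \le n$,
are independent standard normal random variables that are independent from $\zeta$. By the
change of variables formula, we get

$$q_{(\eta_2, \ldots, \eta_n, \rho) \,/\, \zeta = z}(0) = \frac{1}{(2\pi)^{n/2}} \frac{1}{|\det A(z)|}.$$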



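Now we compute $q_{(\xi_2, \ldots, \xi_n) \,/\, (\eta_2, \ldots, \eta_n, \rho) = 0,\ \zeta = z}(0)$. Since $A(z)$ is non-singular for almost every
$z$, the condition $\eta_2 = \ldots = \eta_n = \rho = 0$ implies $\partial_1 f_i = 0$ for $1 \le i \le n$. Therefore, from
Identities (16) and since $\zeta = z$, we have

$$\begin{pmatrix} \xi_2 \\ \vdots \\ \xi_n \end{pmatrix} = 2 B(z) \begin{pmatrix} f_1 \\ \vdots \\ f_n \end{pmatrix}
\quad \text{where} \quad
B(z) := \begin{pmatrix} z_{12} & \cdots & z_{n2} \\ \vdots & & \vdots \\ z_{1n} & \cdots & z_{nn} \end{pmatrix}.$$

Again, $f_1, \ldots, f_n$ are independent standard normal variables independent from
$(\eta_2, \ldots, \eta_n, \rho, \zeta)$ and thus

$$q_{(\xi_2, \ldots, \xi_n) \,/\, (\eta_2, \ldots, \eta_n, \rho) = 0,\ \zeta = z}(0) = \frac{1}{(2\pi)^{(n-1)/2}} \cdot \frac{1}{2^{n-1} (\det(B(z) B(z)^t))^{1/2}}$$

where $B(z)^t$ denotes the transpose of the matrix $B(z)$.
We therefore obtain

$$q_{\nabla \mathcal L \,/\, \zeta = z}(0) = q_{(\xi_2, \ldots, \xi_n) \,/\, (\eta_2, \ldots, \eta_n, \rho) = 0,\ \zeta = z}(0) \cdot q_{(\eta_2, \ldots, \eta_n, \rho) \,/\, \zeta = z}(0)
= \frac{1}{(2\pi)^{n - \frac{1}{2}}\, 2^{n-1}\, |\det A(z)|\, (\det(B(z) B(z)^t))^{1/2}}.$$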



Finally we compute $q_{\mathcal L \,/\, \nabla \mathcal L = 0,\ \zeta = z}(u)$. The conditions $\nabla \mathcal L = 0$ and $\zeta = z$ imply by (18) and
(16) that $\partial_1 f_i = 0$ for $1 \le i \le n$ and $\sum_{i=1}^n f_i z_{ij} = 0$ for $2 \le j \le n$ for almost every $z$.
Plugging the former into (14) we get

$$\mathcal L = \sum_{i=1}^n f_i^2$$

and the latter says that the vector $(f_1, \ldots, f_n)$ is orthogonal to the $(n - 1)$-dimensional
subspace $S$ spanned by the $n - 1$ vectors $(z_{1j}, \ldots, z_{nj})$, $2 \le j \le n$. This shows that
$f_1^2 + \cdots + f_n^2$, the square of the distance of $(f_1, \ldots, f_n)$ to $S$, has the $\chi_1^2$-distribution, since
the property of being a vector of independent standard normal variables is independent of
the choice of the orthonormal basis. So, for $u > 0$,

$$q_{\mathcal L \,/\, \nabla \mathcal L = 0,\ \zeta = z}(u) = \frac{e^{-u/2}}{\sqrt{2\pi u}}.$$
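The density $e^{-u/2}/\sqrt{2\pi u}$ is that of the square of a standard normal variable, so its integral over $(0, a)$ must equal $P(|Z| \le \sqrt a) = \mathrm{erf}(\sqrt{a/2})$. The following sketch checks this numerically; the substitution $u = s^2$ is used only to avoid the integrable singularity at $0$, and the function names are assumptions of the example.

```python
from math import exp, sqrt, pi, erf

def chi2_1_density(u):
    """Density of the chi-square distribution with one degree of freedom, for u > 0."""
    return exp(-u / 2.0) / sqrt(2.0 * pi * u)

def chi2_1_cdf_numeric(a, steps=20000):
    """Midpoint-rule integral of the density over (0, a) after substituting u = s^2."""
    upper = sqrt(a)
    h = upper / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h                           # midpoints, so s > 0
        total += chi2_1_density(s * s) * 2.0 * s    # du = 2 s ds
    return total * h

a = 1.5
print(chi2_1_cdf_numeric(a), erf(sqrt(a / 2.0)))  # the two values agree closely
```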
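We therefore obtain

$$q_{\mathcal L, \nabla \mathcal L \,/\, \zeta = z}(u, 0) = q_{\mathcal L \,/\, \nabla \mathcal L = 0,\ \zeta = z}(u) \cdot q_{\nabla \mathcal L \,/\, \zeta = z}(0)
= \frac{e^{-u/2}}{(2\pi)^n\, 2^{n-1}\, |\det(A(z))|\, (\det(B(z) B(z)^t))^{1/2} \sqrt u}.$$

Plugging this expression into Identity (17) we obtain

$$p_{\mathcal L, \nabla \mathcal L, \zeta}(u, 0, z) = \frac{e^{-u/2}}{(2\pi)^n\, 2^{n-1}\, |\det(A(z))|\, (\det(B(z) B(z)^t))^{1/2} \sqrt u}\ p_\zeta(z). \tag{19}$$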



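Step 5. In this step we focus on the conditional expectation

$$E\big( |\det(\mathcal L'')| \cdot \chi_{\{\mathcal L'' \succ 0\}} \,/\, \mathcal L = u,\ \nabla \mathcal L = 0,\ \zeta = z \big) \tag{20}$$

in the integrand of (13). We obtain the following expression for $\mathcal L''$ under the stated
conditions.

Lemma 2.3. Let $M$ be the symmetric block-matrix in $\mathbb R^{(2n-1) \times (2n-1)}$ of the linear operator
$\mathcal L''$, under the conditions $\mathcal L = u$, $\nabla \mathcal L = 0$ and $\zeta = z$. Let $f^*$ be any solution of the system
$\sum_{i=1}^n f_i z_{ij} = 0$, $2 \le j \le n$, and $\sum_{i=1}^n f_i^2 = u$. Then

$$M = \begin{pmatrix} M_{\sigma\sigma} & M_{\sigma\tau} & M_{\sigma\theta} \\ M_{\tau\sigma} & M_{\tau\tau} & M_{\tau\theta} \\ M_{\theta\sigma} & M_{\theta\tau} & M_{\theta\theta} \end{pmatrix}$$

with row blocks and column blocks of sizes $n - 1$, $n - 1$ and $1$, where

$$(M_{\sigma\sigma})_{jj} = 2 \sum_{i=1}^n \Big( \frac{1}{d_i} (\partial_{1j} f_i)^2 + z_{ij}^2 + f_i^* (\partial_{jj} f_i) - d_i f_i^{*2} \Big) \quad \text{for } 2 \le j \le n,$$

$$(M_{\sigma\sigma})_{jk} = 2 \sum_{i=1}^n \Big( \frac{1}{d_i} (\partial_{1j} f_i)(\partial_{1k} f_i) + z_{ij} z_{ik} + f_i^* (\partial_{jk} f_i) \Big) \quad \text{for } 2 \le j \ne k \le n,$$

$$(M_{\sigma\tau})_{jk} = 2 \sum_{i=1}^n \frac{1}{d_i} (\partial_{1j} f_i)\, z_{ik} \quad \text{for } 2 \le j, k \le n,$$

$$(M_{\sigma\theta})_{j1} = \sqrt 2 \sum_{i=1}^n \frac{1}{d_i} (\partial_{1j} f_i)\, z_{i11} \quad \text{for } 2 \le j \le n,$$

$$(M_{\tau\tau})_{jk} = 2 \sum_{i=1}^n \frac{1}{d_i} z_{ij} z_{ik} \quad \text{for } 2 \le j, k \le n,$$

$$(M_{\tau\theta})_{j1} = \sqrt 2 \sum_{i=1}^n \frac{1}{d_i} z_{i11} z_{ij} \quad \text{for } 2 \le j \le n,$$

$$M_{\theta\theta} = \sum_{i=1}^n \Big( \frac{z_{i11}^2}{d_i} - f_i^* z_{i11} \Big).$$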

Proof. The hypotheses imply that for almost every $z$, one has $\partial_1 f_i = 0$ for $1 \le i \le n$,
$\sum_{i=1}^n f_i z_{ij} = 0$ for $2 \le j \le n$ and $\sum_{i=1}^n f_i^2 = u$. The last two conditions give a system of $n$
equations and $n$ unknowns with exactly two solutions $f^* = (f_1^*, \ldots, f_n^*)$ and $-f^*$ for almost
every $z$ and $u > 0$. Moreover the symmetry of the Gaussian distribution implies that the
law of the coordinates of the matrix $M$ does not change under the stated conditions when
replacing $f_1, \ldots, f_n$ by either one of these solutions. The formulas are then a consequence
of Corollary 3.2 of Section 3 (here we use that $\partial_0 f_i = d_i f_i$ and skip the details).

□

For $z$ fixed, the only random variables that appear in the elements of $M$ are the second
partial derivatives $\partial_{jk} f_i$, $2 \le j, k \le n$ and $\partial_{1j} f_i$, $2 \le j \le n$, $1 \le i \le n$. Therefore, we are in
a position to apply the following result which gets rid of the conditioning in (20).

Lemma 2.4. Let $X = (X_{ij})_{1 \le i \le p,\ 1 \le j \le q}$ be a real random matrix and $Y = (Y_1, \ldots, Y_q)^t$, $Z =
(Z_1, \ldots, Z_p)^t$ be real random vectors. Assume that $X, Y, Z$ are independent, the distributions
of $X$, $Y$ and $Z$ have bounded continuous densities, respectively in $\mathbb R^{p \times q}$, $\mathbb R^q$, $\mathbb R^p$ and that
$p_Y(\cdot)$ and $p_Z(\cdot)$ do not vanish. Let $g : \mathbb R^{p \times q} \to \mathbb R$ be continuous, such that $E(|g(X)|) < +\infty$.
Then, for any $u \in \mathbb R^p$,

$$E\big( g(X) \,/\, XY + Z = u,\ Y = 0 \big) = E(g(X)).$$

The heuristic meaning of the previous lemma is that if we know that Y = 0, then XY + Z 
does not give information on the distribution of X. 

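For $X = \big( \frac{2}{d_i} \partial_{1j} f_i \big)_{2 \le j \le n,\ 1 \le i \le n} \in \mathbb R^{(n-1) \times n}$ and $Y = (\partial_1 f_1, \ldots, \partial_1 f_n)^t$ in the previous
lemma we obtain that

$$E\big( |\det(\mathcal L'')| \cdot \chi_{\{\mathcal L'' \succ 0\}} \,/\, \mathcal L = u,\ \nabla \mathcal L = 0,\ \zeta = z \big) = E\big( |\det(M)| \cdot \chi_{\{M \succ 0\}} \big). \tag{21}$$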



We now consider $E(|\det(M)| \cdot \chi_{\{M \succ 0\}})$. We observe that it is now an unconditional
expectation. We will bound it in terms of $u$ and $z$. We begin by writing the matrix $M$ in a
form that will be useful for our computations.

Notation. To simplify notation, from now on we simply write $A$ and $B$ for the matrices
$A(z)$ and $B(z)$ of Step 4.


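We first observe that

$$M_{\sigma\sigma} = V V^t + 2 B B^t + W - \mu I_{n-1}$$

where

$$V := \sqrt 2 \begin{pmatrix} \frac{1}{\sqrt{d_1}} \partial_{12} f_1 & \cdots & \frac{1}{\sqrt{d_n}} \partial_{12} f_n \\ \vdots & & \vdots \\ \frac{1}{\sqrt{d_1}} \partial_{1n} f_1 & \cdots & \frac{1}{\sqrt{d_n}} \partial_{1n} f_n \end{pmatrix} \in \mathbb R^{(n-1) \times n},
\qquad \mu := 2 \sum_{i=1}^n d_i f_i^{*2}$$

and

$$W := \begin{pmatrix} 2 \sum_{i=1}^n f_i^* \partial_{22} f_i & \cdots & 2 \sum_{i=1}^n f_i^* \partial_{2n} f_i \\ \vdots & & \vdots \\ 2 \sum_{i=1}^n f_i^* \partial_{n2} f_i & \cdots & 2 \sum_{i=1}^n f_i^* \partial_{nn} f_i \end{pmatrix}.$$

Also, introducing for $1 \le i \le n$ and $2 \le j \le n$,

$$\tilde z_{ij} := \frac{2}{\sqrt{d_i}} z_{ij}, \quad \tilde z_j := (\tilde z_{1j}, \ldots, \tilde z_{nj}), \quad \tilde B = \tilde B(z) := \begin{pmatrix} \tilde z_{12} & \cdots & \tilde z_{n2} \\ \vdots & & \vdots \\ \tilde z_{1n} & \cdots & \tilde z_{nn} \end{pmatrix} \in \mathbb R^{(n-1) \times n}$$

and

$$\tilde z_{i11} := \frac{2}{\sqrt{d_i}} z_{i11}, \quad \tilde f_i := \sqrt{d_i}\, f_i^* \quad \text{and} \quad \tilde z_{11} := (\tilde z_{111}, \ldots, \tilde z_{n11}), \quad \tilde f := (\tilde f_1, \ldots, \tilde f_n)$$

so that

$$A = \begin{pmatrix} \tilde B \\ \frac{1}{\sqrt 2} \tilde z_{11} \end{pmatrix}$$

(with row blocks of sizes $n - 1$ and $1$), we get

$$M_{\sigma\tau} = \tfrac{1}{\sqrt 2} V \tilde B^t, \quad M_{\sigma\theta} = \tfrac{1}{2} V \tilde z_{11}^t, \quad M_{\tau\tau} = \tfrac{1}{2} \tilde B \tilde B^t, \quad M_{\tau\theta} = \tfrac{1}{2\sqrt 2} \tilde B \tilde z_{11}^t \quad \text{and} \quad M_{\theta\theta} = \tfrac{1}{4} \tilde z_{11} \tilde z_{11}^t - \tfrac{1}{2} \tilde z_{11} \tilde f^t.$$

Therefore

$$M = \begin{pmatrix}
V V^t + 2 B B^t + W - \mu I_{n-1} & \frac{1}{\sqrt 2} V \tilde B^t & \frac{1}{2} V \tilde z_{11}^t \\
\frac{1}{\sqrt 2} \tilde B V^t & \frac{1}{2} \tilde B \tilde B^t & \frac{1}{2\sqrt 2} \tilde B \tilde z_{11}^t \\
\frac{1}{2} \tilde z_{11} V^t & \frac{1}{2\sqrt 2} \tilde z_{11} \tilde B^t & \frac{1}{4} \tilde z_{11} \tilde z_{11}^t - \frac{1}{2} \tilde z_{11} \tilde f^t
\end{pmatrix}$$

with row blocks and column blocks of sizes $n - 1$, $n - 1$ and $1$.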
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The coefficients of the matrix W appearing in the first block are the centered Gaussian 
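random variables $\{2\sum_{i=1}^n f_i^*\,\partial_{jk} f_i : 2 \le j \le k \le n\}$, which are independent. Applying Lemma 2.2 we obtain

$$\sigma^2 := \mathrm{Var}\Big(2\sum_{i=1}^n f_i^*\,\partial_{jk} f_i\Big) = 4\sum_{i=1}^n d_i(d_i - 1)(f_i^*)^2 \le 4D(D-1)u \quad \text{for } j \ne k,$$

$$\mathrm{Var}\Big(2\sum_{i=1}^n f_i^*\,\partial_{jj} f_i\Big) = 8\sum_{i=1}^n d_i(d_i - 1)(f_i^*)^2 = 2\sigma^2. \tag{22}$$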


As a consequence, dividing each coefficient of $W$ by $\sigma\sqrt{n}$, one can write the matrix $W$ in the form

$$W = \sigma\sqrt{n}\,G,$$

where $G$ is a real random symmetric matrix whose entries $a_{ij}$ are independent centered Gaussian random variables satisfying $\mathrm{Var}(a_{ij}) = 1/n$ for $i \ne j$ and $\mathrm{Var}(a_{ii}) = 2/n$.

We continue now with the bound for $\mathbb{E}(|\det(M)| \cdot \chi_{\{M \succ 0\}})$. The randomness for this expectation lies in the matrices $V$ and $W$, which are stochastically independent by Lemma 2.2.

Denote by $\lambda$ the maximum between $0$ and the largest eigenvalue of the matrix $G$. Using the independence of $V$ and $W$, and the fact that the determinant of a positive semidefinite matrix is an increasing function of the diagonal values, we get



$$\mathbb{E}(|\det(M)| \cdot \chi_{\{M \succ 0\}}) \le \mathbb{E}(|\det(M_1)| \cdot \chi_{\{M_1 \succ 0\}}) \tag{23}$$

where $M_1$ is given by:



Mi = 



/ VV t + 2BB t + <7^/E\I n _ 1 


-^VS* 


\V%x \ 


T2* Vt 


|BB* 


^Bz\i 


\ kznV* 


^n-B* 


\ziizi-t - -Zllf J 



\ „- 



We note that 



where 



and 



M = 



det(Mi) = det(M 2 ) - -zuf det(M ). 



VV t + 2BB t + a y /EXI n ^ 1 



7=2 BVt 



^ VBt 



\B& 



(24) 



M 2 = 



/ VV t + 2BB t + o^\I n - 1 
V 



75 BV * 



kziiV 1 



W* 



\BB* 



^znB* 



ivzii \ - 1 



VI By* 
4 nz ll 



\zxiz\x J 
Observe that $M_0$ and $M_2$ can be written as

$$M_0 = N_0 N_0^t \quad\text{and}\quad M_2 = N_2 N_2^t$$



where 



^0:=! 



n 

V 


n 

V2B 


n-l 


* 8 









ra-1 



and 



n-1 



A? 



/ V 


V2B 


(a^A) 1 ^ 


- \ 


^B 








V 2^11 








/ 



n-l 



Therefore they are both positive semidefinite. Moreover, $\det(M_2)$ is the square of the $(2n-1)$-volume of the parallelotope generated by the $2n-1$ rows of $N_2$. This volume equals the distance from the last row to the subspace generated by the rows of $N_0$ times the volume of the parallelotope defined by these $2n-2$ rows. The distance from the last row to the subspace generated by the rows of $N_0$ is bounded by the distance to the smaller subspace generated by the $n-1$ rows of the matrix



( V2 



BOO 



which is clearly equal to 



dist [ -2:11,5 



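where $S := \mathrm{span}(\tilde z_2, \ldots, \tilde z_n) \subset \mathbb{R}^n$. Now we recall that $(f_1^*, \ldots, f_n^*)$ satisfies the conditions $\sum_{i=1}^n f_i^* z_{ij} = 0$, $2 \le j \le n$, which implies

$$\langle \tilde f, \tilde z_j \rangle = 2\sum_{i=1}^n f_i^* z_{ij} = 0, \qquad 2 \le j \le n.$$

This means that $\tilde f$ is orthogonal to $S$ so that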



dist ( 2 Z H' S ') = 2 



l/ll 



,2n 



Therefore 



det(M 2 ) < - 



1/11 



Zn 



det(No) 



l/ll 



;,zn 



Using this inequality to bound $\det(M_2)$ in (24), we have that



det(M 



(25) 



\det(M x )\<±(± 



J_ 

l/ll 



-7 Z11 



+ U/,2ii)|) det(Mo) 
and therefore, since $M_0$ is positive semidefinite,



EfldettMOI-x^M,})^^ 



-j zn 



+ |</,2 n }|)E(det(Mo)). (26) 



We now turn to $\mathbb{E}(\det(M_0))$.

Notation. For a matrix $M$ and a subset $S$ (respectively $R$) of its columns (resp. of its rows), we denote by $M^S$ (resp. $M_R$) the sub-matrix of $M$ consisting of the columns in $S$ (resp. the rows in $R$). Also, $M_R^S$ denotes the matrix obtained by erasing the columns not in $S$ and the rows not in $R$.



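Lemma 2.5. Let $C = (c_{ij})_{i,j} \in \mathbb{R}^{m \times m}$. For $q \in \mathbb{Z}$, $1 \le q \le m$, and $\lambda \in \mathbb{R}$ define

$$C_q(\lambda) := C + \Lambda_q \quad\text{where}\quad \Lambda_q := \begin{pmatrix} \lambda\,\mathrm{Id}_q & 0 \\ 0 & 0 \end{pmatrix} \in \mathbb{R}^{m \times m},$$

i.e. the matrix obtained by adding $\lambda$ to the first $q$ diagonal entries of $C$. Then,

$$\det(C_q(\lambda)) = \det(C) + \sum_{\ell=1}^{q} \Big( \sum_{S \subset \{1,\ldots,q\}:\ \#(S)=\ell} \det\big(C_{\bar S}^{\bar S}\big) \Big) \lambda^\ell,$$

where $\bar S$ is the complement set of $S$, with the convention that $\det(C_\emptyset^\emptyset) = 1$.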
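As a sanity check of the expansion in Lemma 2.5, the following minimal Python sketch compares both sides on a small integer matrix. The helper names (`det`, `shifted_det`, `expansion`) and the test matrix are illustrative assumptions and not part of the original argument; the sum runs over all nonempty subsets $S$ of the first $q$ indices, and the empty minor has determinant 1 by convention.

```python
from itertools import combinations


def det(m):
    """Exact determinant by cofactor expansion; the empty matrix has determinant 1."""
    if not m:
        return 1
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def shifted_det(c, q, lam):
    """Determinant of C after adding lam to its first q diagonal entries."""
    shifted = [row[:] for row in c]
    for i in range(q):
        shifted[i][i] += lam
    return det(shifted)


def expansion(c, q, lam):
    """det(C) plus the principal minors on the complement of S, weighted by lam^#(S)."""
    m = len(c)
    total = det(c)
    for ell in range(1, q + 1):
        for s in combinations(range(q), ell):
            keep = [i for i in range(m) if i not in s]  # complement of S
            minor = [[c[i][j] for j in keep] for i in keep]
            total += det(minor) * lam ** ell
    return total


# Illustrative 4 x 4 integer matrix (hypothetical data).
C = [[2, -1, 0, 3], [1, 4, 2, 0], [0, 5, 1, -2], [3, 1, 1, 1]]
checks = [
    (shifted_det(C, q, lam), expansion(C, q, lam))
    for q in range(1, 5)
    for lam in (-2, 1, 3)
]
print(all(lhs == rhs for lhs, rhs in checks))
```

Integer arithmetic keeps the comparison exact, so no tolerance is needed.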
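We set $\Lambda := \sigma\sqrt{n}\,\lambda$ and write $M_0 = C + \Lambda_{n-1}$ where

$$C := \begin{pmatrix} VV^t + 2BB^t & \frac{1}{\sqrt2}\,V\tilde B^t \\ \frac{1}{\sqrt2}\,\tilde B V^t & \frac12\,\tilde B\tilde B^t \end{pmatrix} \quad\text{and}\quad \Lambda_{n-1} := \begin{pmatrix} \Lambda\,\mathrm{Id}_{n-1} & 0 \\ 0 & 0 \end{pmatrix},$$

both diagonal blocks being of size $(n-1) \times (n-1)$.

Then, by Lemma 2.5 and using that the random variables involved in the expectation of $M_0$ are the elements of $V$ and $\lambda$, which are independent, we obtain

$$\mathbb{E}(\det(M_0)) = \mathbb{E}(\det(C)) + \sum_{\ell=1}^{n-1}\ \sum_{S \subset \{1,\ldots,n-1\},\ \#(S)=\ell} \mathbb{E}\big(\det(C_{\bar S}^{\bar S})\big)\,(\sigma\sqrt n)^\ell\,\mathbb{E}(\lambda^\ell). \tag{27}$$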



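We now bound the expectations appearing here. We first consider $\mathbb{E}(\det(C))$.

Lemma 2.6. Let $n, k \in \mathbb{N}$, $1 \le k \le n$. Let $A = (a_{ij})_{i,j}$, $B \in \mathbb{R}^{k \times n}$ and $C \in \mathbb{R}^{(n-1) \times n}$. Define

$$Q := \begin{pmatrix} AA^t + BB^t & AC^t \\ CA^t & CC^t \end{pmatrix} \in \mathbb{R}^{(k+n-1) \times (k+n-1)}.$$

Then,

$$\det(Q) = \det(CC^t)\det(BB^t) + \sum_{\#(S)=k-1} \Big( \sum_{i=1}^{k} \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det\big(B_{\bar i}^{S}\big) \det\big(C^{\bar j}\big) \Big)^2.$$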
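The identity of Lemma 2.6 can be tested numerically. The sketch below is only an illustration under stated assumptions: the function names (`det_q`, `lemma_rhs`) and the integer test matrices are hypothetical, $S$ ranges over sets of $k-1$ columns of $B$, $\bar i$ removes row $i$ of $B$, and $\bar j$ removes column $j$ of $C$. Indices are zero-based in the code, which does not change the parity of $i + j$.

```python
from itertools import combinations


def det(m):
    """Exact determinant by cofactor expansion; the empty matrix has determinant 1."""
    if not m:
        return 1
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def det_q(a, b, c):
    """det of Q = [[AA^t + BB^t, AC^t], [CA^t, CC^t]]."""
    at, bt, ct = transpose(a), transpose(b), transpose(c)
    aat, bbt = matmul(a, at), matmul(b, bt)
    top_left = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(aat, bbt)]
    top = [r1 + r2 for r1, r2 in zip(top_left, matmul(a, ct))]
    bottom = [r1 + r2 for r1, r2 in zip(matmul(c, at), matmul(c, ct))]
    return det(top + bottom)


def lemma_rhs(a, b, c):
    """det(CC^t) det(BB^t) plus the sum of squares over column sets S of size k - 1."""
    k, n = len(a), len(a[0])
    total = det(matmul(c, transpose(c))) * det(matmul(b, transpose(b)))
    for s in combinations(range(n), k - 1):
        inner = 0
        for i in range(k):
            # B restricted to the columns in S, with row i removed.
            b_minor = [[b[r][col] for col in s] for r in range(k) if r != i]
            for j in range(n):
                # C with column j removed.
                c_minor = [[row[col] for col in range(n) if col != j] for row in c]
                inner += (-1) ** (i + j) * a[i][j] * det(b_minor) * det(c_minor)
        total += inner ** 2
    return total


# Hypothetical test data: k = 2, n = 3.
A = [[1, 2, -1], [0, 3, 2]]
B = [[2, 0, 1], [1, -1, 3]]
C = [[1, 1, 0], [2, -1, 4]]
print(det_q(A, B, C) == lemma_rhs(A, B, C))
```

For $k = 1$, $n = 2$ the formula reduces to $\det(Q) = |b|^2|c|^2 + (a_1c_2 - a_2c_1)^2$, which is the Lagrange identity and gives a quick check by hand.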

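Applying this result for $k := n-1$, $A := V$, $B := \sqrt2\,B$ and $C := (1/\sqrt2)\,\tilde B$ we get

$$\det(C) = \det(\tilde B\tilde B^t)\det(BB^t) + \sum_{\#(S)=n-2} \Big( \sum_{i=1}^{n-1} \sum_{j=1}^{n} (-1)^{i+j} v_{ij} \det\big(\sqrt2\,B_{\bar i}^{S}\big) \det\big(\tfrac{1}{\sqrt2}\,\tilde B^{\bar j}\big) \Big)^2.$$

Since the random variables $v_{ij} = \sqrt{2/d_j}\,\partial_{1,i+1} f_j$ are centered and independent, and since $\mathrm{Var}(v_{ij}) = 2(d_j - 1)$, we obtain

$$\begin{aligned}
\mathbb{E}(\det(C)) &= \det(\tilde B\tilde B^t)\det(BB^t) + \sum_{\#(S)=n-2} \mathbb{E}\Big( \Big( \sum_{i=1}^{n-1} \sum_{j=1}^{n} (-1)^{i+j} v_{ij} \det\big(\sqrt2\,B_{\bar i}^{S}\big) \det\big(\tfrac{1}{\sqrt2}\,\tilde B^{\bar j}\big) \Big)^2 \Big) \\
&= \det(\tilde B\tilde B^t)\det(BB^t) + \sum_{\#(S)=n-2} \sum_{i=1}^{n-1} \sum_{j=1}^{n} 2(d_j - 1)\,2^{n-2} \big(\det(B_{\bar i}^{S})\big)^2\,\frac{1}{2^{n-1}} \big(\det(\tilde B^{\bar j})\big)^2 \\
&\le \det(\tilde B\tilde B^t)\det(BB^t) + (D-1) \sum_{\#(S)=n-2} \sum_{i=1}^{n-1} \sum_{j=1}^{n} \big(\det(B_{\bar i}^{S})\big)^2 \big(\det(\tilde B^{\bar j})\big)^2 \\
&= \det(\tilde B\tilde B^t) \Big( \det(BB^t) + (D-1) \sum_{i=1}^{n-1} \det\big(B_{\bar i} B_{\bar i}^t\big) \Big),
\end{aligned} \tag{28}$$

where in the last equality we applied twice the well-known Cauchy-Binet formula, see for example [14]: for $m \le n$, $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{n \times m}$,

$$\det(AB) = \sum_{\#(S)=m} \det(A^S)\det(B_S). \tag{29}$$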
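The Cauchy-Binet formula (29) is easy to check on a small example. The sketch below assumes integer test matrices chosen only for illustration, and the helper names are hypothetical; $A^S$ keeps the columns of $A$ indexed by $S$ and $B_S$ keeps the rows of $B$ indexed by $S$.

```python
from itertools import combinations


def det(m):
    """Exact determinant by cofactor expansion; the empty matrix has determinant 1."""
    if not m:
        return 1
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def cauchy_binet_rhs(a, b):
    """Sum over all m-subsets S of det(A^S) det(B_S), for A of size m x n, B of size n x m."""
    m, n = len(a), len(a[0])
    total = 0
    for s in combinations(range(n), m):
        a_cols = [[row[j] for j in s] for row in a]  # columns S of A
        b_rows = [b[j] for j in s]                   # rows S of B
        total += det(a_cols) * det(b_rows)
    return total


# Hypothetical data with m = 2, n = 4.
A = [[1, 2, 0, -1], [3, 1, 4, 2]]
B = [[2, 1], [0, 3], [1, -1], [5, 2]]
lhs = det(matmul(A, B))
rhs = cauchy_binet_rhs(A, B)
print(lhs == rhs)
```

Taking $B = A^t$ gives $\det(AA^t) = \sum_S (\det A^S)^2$, which is the form used twice in (28).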

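Now we compute $\mathbb{E}(\det(C_{\bar S}^{\bar S}))$ for $\#(S) = \ell$, $1 \le \ell \le n-1$.

• For $\ell = n-1$ it is obvious that

$$\det(C_{\bar S}^{\bar S}) = (1/2^{n-1}) \det(\tilde B\tilde B^t). \tag{30}$$

• For $1 \le \ell \le n-2$, we note that for each $S \subset \{1,\ldots,n-1\}$ with $\#(S) = \ell$, we have

$$C_{\bar S}^{\bar S} = \begin{pmatrix} V_{\bar S}(V_{\bar S})^t + 2B_{\bar S}(B_{\bar S})^t & \frac{1}{\sqrt2}\,V_{\bar S}\tilde B^t \\ \frac{1}{\sqrt2}\,\tilde B (V_{\bar S})^t & \frac12\,\tilde B\tilde B^t \end{pmatrix},$$

with diagonal blocks of sizes $n-1-\ell$ and $n-1$, and we obtain, imitating the computation for the case $\det(C)$,

$$\mathbb{E}(\det(C_{\bar S}^{\bar S})) \le \frac{\det(\tilde B\tilde B^t)}{2^\ell} \Big( \det\big(B_{\bar S}(B_{\bar S})^t\big) + (D-1) \sum_{1 \le i \le n-1,\ i \notin S} \det\big(B_{\overline{S \cup \{i\}}}(B_{\overline{S \cup \{i\}}})^t\big) \Big). \tag{31}$$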
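Finally we give an upper bound for $\mathbb{E}(\lambda^\ell)$.

Lemma 2.7. Let $G = (a_{ij})_{1 \le i,j \le n}$ for $n \ge 2$ be a real random symmetric matrix such that the random variables $\{a_{ij},\ 1 \le i \le j \le n\}$ are independent centered Gaussian, $\mathrm{Var}(a_{ij}) = 1/n$ for $i \ne j$ and $\mathrm{Var}(a_{ii}) = 2/n$, and denote by $\lambda$ the maximum between $0$ and the largest eigenvalue of the matrix $G$. Then, for $1 \le \ell \le n$,

$$\mathbb{E}(\lambda^\ell) \le 2 \cdot 4^\ell.$$

Plugging Inequalities (28), (31), (30) and Lemma 2.7 into Formula (27) we obtain

$$\begin{aligned}
\mathbb{E}(\det(M_0)) &\le \det(\tilde B\tilde B^t) \Big( \det(BB^t) + (D-1) \sum_{i=1}^{n-1} \det\big(B_{\bar i}B_{\bar i}^t\big) + 2^n (\sigma\sqrt n)^{n-1} \\
&\qquad + \sum_{\ell=1}^{n-2} 2^{\ell+1} (\sigma\sqrt n)^\ell \sum_{\#(S)=\ell} \Big( \det\big(B_{\bar S}(B_{\bar S})^t\big) + (D-1) \sum_{i \notin S} \det\big(B_{\overline{S \cup \{i\}}}(B_{\overline{S \cup \{i\}}})^t\big) \Big) \Big) \\
&\le \det(\tilde B\tilde B^t) \Big( \det(BB^t) + (D-1) \sum_{i=1}^{n-1} \det\big(B_{\bar i}B_{\bar i}^t\big) + 2^n (\sigma\sqrt n)^{n-1} \\
&\qquad + \sum_{\ell=1}^{n-2} 2^{\ell+1} (\sigma\sqrt n)^\ell \Big( \sum_{\#(S)=\ell} \det\big(B_{\bar S}(B_{\bar S})^t\big) + (D-1)(\ell+1) \sum_{\#(T)=\ell+1} \det\big(B_{\bar T}(B_{\bar T})^t\big) \Big) \Big).
\end{aligned}$$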


This finally implies, by Identity (21) and Inequalities (23) and (26), the inequality we will focus on in the next step.



E 



(|det(Z")| -X^yo}/ L = u,VL = 0,( = z) < i(I| /-L,I n \ |' + |(/,z n )|) det(BB< 



. f\\ 

n-l 

det(RB*) + (D - 1) £ det (BjBj) + 2"(o- v / ^)™~ 1 



*/: 



+ ^ 2'+\aVn-y( J2 det(%(%)*) + (D - 1)(* + 1) £ det (%(B 

*=1 #(S)=£ #(T)=i+l 

(32) 

Step 6. We put together the calculations of Steps 4 and 5 to compute an upper bound for $p_L(u)$ following Inequality (13). We will also use the following auxiliary result:

Lemma 2.8.

$$\frac{2^{2n-1}}{\mathcal D} \det(BB^t) \le \det(\tilde B\tilde B^t) \le \frac{2^{2(n-1)} D}{\mathcal D} \det(BB^t),$$

■ det(%(%)<) < det(%(%)<) < det(%(%)') for S C {1, ... , n}, #(S) = 



•T) V b V O / / — \ O \ b ' ' — f) 
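Proof. We have $\tilde B = BH$ for the diagonal matrix

$$H := \mathrm{diag}\Big( \frac{2}{\sqrt{d_1}}, \ldots, \frac{2}{\sqrt{d_n}} \Big).$$

By the Cauchy-Binet formula (29),

$$\begin{aligned}
\det(\tilde B\tilde B^t) &= \sum_{k=1}^{n} \det(\tilde B^{\bar k}) \det\big((\tilde B^{\bar k})^t\big) = \sum_{k=1}^{n} \big(\det(B^{\bar k} H_{\bar k}^{\bar k})\big)^2 = \sum_{k=1}^{n} \big(\det(B^{\bar k})\big)^2 \big(\det(H_{\bar k}^{\bar k})\big)^2 \\
&= \sum_{k=1}^{n} \frac{2^{2(n-1)} d_k}{\mathcal D} \big(\det(B^{\bar k})\big)^2.
\end{aligned}$$

The proof concludes using

$$\frac{2^{2n-1}}{\mathcal D} \le \frac{2^{2(n-1)} d_k}{\mathcal D} \le \frac{2^{2(n-1)} D}{\mathcal D} \quad \text{since } d_k \ge 2 \text{ and } \sum_{k=1}^{n} \big(\det(B^{\bar k})\big)^2 = \det(BB^t).$$

The proof of the second assertion is analogous. □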

According to Inequalities (|13l) . (pl2"1) . and Identity p^|) . we get: 

Pi(«) <<tv(V) / E(\det(L")\-XrZ»y y/ L = u,VL = 0,t = z) -p L z Ju,0,z) dz 



(Sxl 



<MV) 9(9(1^7^") +|(/^n>|)det(BB*)- 



ft — _L 

(det(BB') + (D - 1)J2 det (B ? S|) + 2'> v ^)™- 1 

n-2 

E2 £+ VV^( g det(%(%)*) + (D-l)(* + l) E det (%(%)')))• 

p^(z)(iz. 



#(S)=f #(T)=M-1 

e -„/2 



(27r)«2«- 1 |det(A)|(det(B J B t )) 1/2 V7i 

Here we notice that $|\det(A)|$ is the $n$-volume of the parallelotope generated in $\mathbb{R}^n$ by the rows of $A$, that is, in the same way we computed $\det(M_2)$ in (25), we have



1 ~ 7^\ , . / s St> 1 /o 1 



det(A)| =dist — zn.S) dct^B*) 1 / 2 = -= ( -4^,z n ) (det(BB*)) 



V2 li5 V v ; V2 
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/ 



l/ll 



,1/2 



where, as previously, $S := \mathrm{span}(z_2, \ldots, z_n) \subset \mathbb{R}^n$ is the hyperplane spanned by the rows of $B$. Therefore, using the Cauchy-Schwarz inequality for the inner product above, applying Lemma 2.8 and the fact that $2^n \le \mathcal D$, we get

Pl{u)<o v {V) j^- - p- n + / ( • 

J(SxR)" (470" 2 ^Vdet^B*) 7 

ri-1 

• (det(BB') + (D - 1) ^ dct (B ? S|) + 2 , >V^)"~ 1 



^2 £+1 (a^)% E det(%(%)*) + (D-l)(* + l) E det (%(%)*)))• ^- K W^ 



u/2 



#(S)=^ #(T)=M-1 

V2 ,!„_ „ . „r„w_i VD 



J(SxE)" (4tt)™ v 2 VX> 



(^^ det(BS') + (D - 1) £ ^=2 ^m(BjY) + V{a^n) n ~ l 

1=1 

n — 2 

- E^ + VV^( E ^ri=idet(%(%)*) + (D-l)(l+l) £ ^^(^V] 



e -u/2 

i^Pc(z)dz 

^( F )/ 7 ^(^Fii|| + ||/||)v^^-(dct( J gB t ) + 2(D-l)Edet(F I (B I ) t ) + 2(4aV^r- 1 

,-u/2 



ra-2 

iz 

' ' ' '// , n ' " 

«=1 #(S)=* #(T)=«+1 



E(4aV^)'( E 2det(%(%)*) + (D-l)(^ + l) E det (%(%)*)))• ^Pc (*)<** 

«=i x 



where 



H(uX) : - (^MVO^IICnll + \\f\\)VBV ■ (det(B(C)B*(0) + 2(4a^) r 



71—1 71—2 



2(D-l)Edet(B I (C)(B I ) t (C))+ E( 4ff v^)'( E 2det(%(C)(%) t (0) 
»=1 «=l #{S)=l 

(D - 1)(* + 1) E det (%(0(%(C))*))) • ^- 



#(T)=£+1 
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Here 



£(C):=| : n-i and (ii : = (-7=duh, ■ ■ ■ , -j=d\if n )- 

Our next goal is then to bound $\mathbb{E}(H(u, \zeta))$. We first note that the matrix $B(\zeta)$ is independent from $\zeta_{11}$, so that the expectation can be factorized as a product of expectations.



First, using Lemma 12.21 and the definition of / we easily get 

.1, 



E(-||Cn|| + 11/11) < y/2(D-l)n + VBu. 

For the other expectations we apply the following. 

Lemma 2.9. (e.g. JU Lemma 13.6]) Set m < n and let U be an m x n random matrix 
whose elements are independent real standard normal. Then 

$$\mathbb{E}(\det(UU^t)) = \frac{n!}{(n-m)!}. \qquad \Box$$
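One standard way to see the identity of Lemma 2.9, sketched here as an aside and not taken from the cited reference, combines the Cauchy-Binet formula (29) with the Leibniz expansion of a determinant. The symbol $G = (g_{ij})$ is introduced only for this sketch and denotes an $m \times m$ matrix with independent standard normal entries, while $U^S$ keeps the columns of $U$ indexed by $S$.

```latex
\begin{aligned}
\mathbb{E}\big(\det(UU^t)\big)
  &= \sum_{\#(S)=m} \mathbb{E}\big((\det U^S)^2\big)
  && \text{(Cauchy-Binet)} \\
\mathbb{E}\big((\det G)^2\big)
  &= \sum_{\pi,\rho} \operatorname{sgn}(\pi)\operatorname{sgn}(\rho)
     \prod_{i=1}^{m} \mathbb{E}\big(g_{i\pi(i)}\, g_{i\rho(i)}\big) = m!
  && \text{(only } \pi = \rho \text{ contributes)} \\
\mathbb{E}\big(\det(UU^t)\big)
  &= \binom{n}{m}\, m! = \frac{n!}{(n-m)!}.
\end{aligned}
```

The second line uses independence across rows: each factor equals 1 when $\pi(i) = \rho(i)$ and 0 otherwise.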

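Therefore, since by Lemma 2.2 $\frac12 B(\zeta)$ satisfies the hypothesis of the lemma with $m = n-1$, we obtain

$$\mathbb{E}\big(\det(B(\zeta)B^t(\zeta))\big) = 4^{n-1}\,n!$$

and we get similar expressions for the other determinants in $\mathbb{E}(H(u, \zeta))$:

$$\mathbb{E}\big(\det(B_{\bar i}(\zeta)(B_{\bar i})^t(\zeta))\big) = 4^{n-2}\,\frac{n!}{2},$$

$$\mathbb{E}\big(\det(B_{\bar S}(\zeta)(B_{\bar S})^t(\zeta))\big) = 4^{n-1-\ell}\,\frac{n!}{(\ell+1)!},$$

$$\mathbb{E}\big(\det(B_{\bar T}(\zeta)(B_{\bar T}(\zeta))^t)\big) = 4^{n-2-\ell}\,\frac{n!}{(\ell+2)!}.$$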
We also apply Formula ©: cr v {V) = 4 v / 27T n+ 5 /(T(n/2)T((n + l)/2)). Therefore 
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E(H{u,Q) 



V2 AV2n n+ 3 



— — (V2(D- lW + VMvTO 
(87r)«r(n/2)r((n+l)/2) VV V ' ; 

4"-V! + 2(4crV^) n - 1 + 2(D - 1) J2 4 "" 2 T 

i=l 



n— 2 



J2(^v^y 



2 • 4* 



(£+1)! 



(D-l)(* + l) 



n- 1 

£ + 1 



(* + 2)! 



j/2 



8"- 1 r(n/2)r((n+l)/2) 



(V2(D - l)n + v / D^)\/DP4"- 1 r 



1 + 2 



(aV^)"- 1 (D-l)(n-l) 



in! 4 

(D- !)(■£ + !) /n- 1\ 1 



n—z 



f + iy (^ + 2)! 



; -u/2 



ra- 1 



^ y(£+i)! 



< 



2»- 1 r(n/2)r((n+l)/2) 



-—U2(D-l)n + V^)VUDn\ 



tn-\ 



e(":W + 2^^x;7": 2 W 



£=0 



,-u/2 



(33) 

Now, we assume $n \ge 3$ and we bound this expectation for $0 < u \le 1/(4D^2n^5)$, in which case, by the bound for $\sigma^2$ given in (22), $\sigma^2 \le 4D(D-1)u < 1/n^5$.

We will use throughout the bounds $1 + x \le e^x$ for any $x$ and $e^x - 1 \le 2x$ for $0 \le x \le 1$. The factorial term $n! = \Gamma(n+1)$ and the other Gamma functions in the first line of the right-hand side of Inequality (33) can be bounded through Stirling's formula [U Formula 6.1.38]: for any $x > 0$,

$$\Gamma(x+1) = \sqrt{2\pi x}\,\Big(\frac{x}{e}\Big)^x e^{\theta/(12x)} \quad \text{for some } 0 < \theta = \theta(x) < 1,$$

so that

$$\sqrt{2\pi x}\,\Big(\frac{x}{e}\Big)^x < \Gamma(x+1) < \sqrt{2\pi x}\,\Big(\frac{x}{e}\Big)^x e^{1/(12x)}.$$

Also,

$$\sqrt{Du} \le \frac{1}{2\sqrt D\,n^{5/2}} \le \sqrt{2(D-1)n}\,\Big( \frac{1}{2\sqrt{2(D-1)n}\,\sqrt D\,n^{5/2}} \Big) \le \frac{\sqrt{2(D-1)n}}{4n^3},$$

which implies

$$\sqrt{2(D-1)n} + \sqrt{Du} \le \sqrt{2(D-1)n}\,\Big( 1 + \frac{1}{4n^3} \Big).$$
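The two-sided Stirling bound quoted above, $\sqrt{2\pi x}\,(x/e)^x < \Gamma(x+1) < \sqrt{2\pi x}\,(x/e)^x e^{1/(12x)}$, can be checked numerically. The sketch below is an illustration only; the function name `stirling_bounds` and the sample points are assumptions, and `math.gamma` from the Python standard library plays the role of $\Gamma$.

```python
import math


def stirling_bounds(x):
    """Lower and upper Stirling bounds for Gamma(x + 1), valid for x > 0."""
    base = math.sqrt(2.0 * math.pi * x) * (x / math.e) ** x
    return base, base * math.exp(1.0 / (12.0 * x))


# Hypothetical sample points.
samples = [0.5, 1.0, 2.5, 10.0, 50.0]
results = []
for x in samples:
    lower, upper = stirling_bounds(x)
    results.append((x, lower, math.gamma(x + 1.0), upper))

for x, lower, value, upper in results:
    print(x, lower < value < upper)
```

The gap between the two bounds shrinks like $1/(12x)$ in relative terms, which is why they suffice for the estimate of the first line of (33).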






Therefore, the first line of the right-hand side of Inequality (1551) satisfies 



TOn! 



2»- 1 r(n/2)r((n + l)/2) 



-— (^2(D-l)n + VL> 



2"-V(n-2)^v^l) 



1W Vw- 2/ Vn-1/ V V ; V 4n 3 / 



/ DPV2^(-)"e 1 /( 12 ") 



,l/(12n) 



irfert^r^v^JO + i)^ 



e 3 / 2 y (rc-2)(n-l) 

< 3Dv^n 3/2 (l+^V) fl + — ) <4D\/l>n 3/2 . 
~ V 4n 3 / V 6n/ ~ 

We now turn our attention to the term under brackets in the right-hand side of Inequality (33). We have $\sigma\sqrt n \le 1/n^2$. Therefore

E , Y°^Y + - — r — -£ o w^' 



1=0 

< 



£ 



( 



e=o 
1 \»-i (D-l)fn-l) / 1 \™- 2 



n / 4 v n 

■>-* / 1 (D-l)(n-l)\ / 2(n-2) 

e^(l + — + ^ ii M < (1+ l 2 ; 

n 4 / \ n 

Adding up, since $e^{-u/2} \le 1$, we obtain



)( 



1 (D-l)(ra-l)\ 
1 + ^ + - t -1 <nD. 

n 2 4 



$$p_L(u) \le \mathbb{E}(H(u, \zeta)) \le 4D^2 \mathcal D^{1/2} n^{5/2}\,\frac{1}{\sqrt u}.$$



Step 7. We finally complete the proof of Theorem 1.2.

For $0 < \alpha \le 1/(4D^2n^5)$, the previous estimate for $p_L(u)$ implies

$$\mathbb{P}(L < \alpha) = \int_0^\alpha p_L(u)\,du \le 8D^2 \mathcal D^{1/2} n^{5/2} \sqrt\alpha.$$

Let us go back to the starting inequality ©: 

$$\mathbb{P}(\kappa(f) > a) \le \mathbb{P}\Big( L < \frac{(1 + \ln a)N}{a^2} \Big) + \exp\Big( -\frac{N}{2} \big( \ln a - \ln(\ln a + 1) \big) \Big)$$



where we recall that 



N = Y,( n + d "\ : » 



! = 1 



D+2 



By hypothesis in the theorem, $a \ge a_n := 4D^2 n^3 N^{1/2}$.

We set $\alpha := (1 + \ln a)N/a^2$ and verify $\alpha \le 1/(4D^2n^5)$. It is enough to verify it with $a_n$:

$$\frac{(1 + \ln a_n)N}{a_n^2} \le \frac{1}{4D^2n^5} \iff 1 + \ln a_n \le 4D^2 n,$$
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which is satisfied since for D > 2 and n > 3, 

l+lna„ < l+ln(4D 2 )+31nn+ ^ Inn < 4D 2 + ( — +4) Inn < 4D 2 +4D 2 (n-l) = 4D 2 n. 
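Therefore, by Inequality (2),

$$\begin{aligned}
\mathbb{P}(\kappa(f) > a) &\le \mathbb{P}\Big( L < \frac{(1 + \ln a)N}{a^2} \Big) + \exp\Big( -\frac{N}{2} \big( \ln a - \ln(\ln a + 1) \big) \Big) \\
&\le 8D^2 \mathcal D^{1/2} n^{5/2}\,\frac{\sqrt{(1 + \ln a)N}}{a} + \frac{1}{a} \le K_n\,\frac{(1 + \ln a)^{1/2}}{a},
\end{aligned}$$

where $K_n = 8D^2 \mathcal D^{1/2} N^{1/2} n^{5/2} + 1$. Here we used $\exp\big((-N/2)(\ln a - \ln(\ln a + 1))\big) \le 1/a$ for $a \ge 2$, $N \ge 10$. This proves part (i) of Theorem 1.2.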



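(ii) We verify that $K_n \ge a_n$. It is enough to check

$$8D^2 \mathcal D^{1/2} N^{1/2} n^{5/2} \ge 4D^2 n^3 N^{1/2} \iff 2\mathcal D^{1/2} \ge n^{1/2} \iff 4\mathcal D \ge n,$$

which holds because $4\mathcal D \ge 4 \cdot 2^n \ge n$. Therefore we can write

$$\begin{aligned}
\mathbb{E}(\ln \kappa(f)) &= \int_0^{+\infty} \mathbb{P}(\ln \kappa(f) > x)\,dx \le \ln K_n + \int_{\ln K_n}^{+\infty} \mathbb{P}(\kappa(f) > e^x)\,dx \\
&\le \ln K_n + \int_{\ln K_n}^{+\infty} K_n (1 + x)^{1/2} e^{-x}\,dx \\
&\le \ln K_n + K_n \int_{\ln K_n}^{+\infty} x^{1/2} e^{-x}\,dx + \frac{K_n}{2} \int_{\ln K_n}^{+\infty} x^{-1/2} e^{-x}\,dx \\
&= \ln K_n + K_n \big( e^{-\ln K_n} (\ln K_n)^{1/2} \big) + K_n \int_{\ln K_n}^{+\infty} x^{-1/2} e^{-x}\,dx \\
&\le \ln K_n + (\ln K_n)^{1/2} + K_n (\ln K_n)^{-1/2} \int_{\ln K_n}^{+\infty} e^{-x}\,dx \\
&= \ln K_n + (\ln K_n)^{1/2} + (\ln K_n)^{-1/2}.
\end{aligned}$$

Here we used the inequality $(1 + x)^{1/2} \le x^{1/2} + \frac12 x^{-1/2}$ for $x > 0$ and integration by parts.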
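The last chain of inequalities can be illustrated numerically. The sketch below assumes the standard closed form $\Gamma(3/2, t) = \sqrt t\,e^{-t} + \frac{\sqrt\pi}{2}\,\mathrm{erfc}(\sqrt t)$ for the upper incomplete Gamma function, which gives $\int_c^{\infty} \sqrt{1+x}\,e^{-x}\,dx = e\,\Gamma(3/2, 1+c)$. The function names and the sample values of $K$ are hypothetical and stand for $K_n$.

```python
import math


def tail_integral(K):
    """K * integral from ln K to infinity of sqrt(1 + x) * exp(-x) dx, in closed form."""
    t = 1.0 + math.log(K)
    upper_gamma = math.sqrt(t) * math.exp(-t) + 0.5 * math.sqrt(math.pi) * math.erfc(math.sqrt(t))
    return K * math.e * upper_gamma


def closed_form_bound(K):
    """The bound (ln K)^(1/2) + (ln K)^(-1/2) obtained in the text."""
    c = math.log(K)
    return math.sqrt(c) + 1.0 / math.sqrt(c)


# Pointwise inequality (1 + x)^(1/2) <= x^(1/2) + (1/2) x^(-1/2) on a grid in (0, 100].
grid = [0.01 * k for k in range(1, 10001)]
pointwise_ok = all(math.sqrt(1.0 + x) <= math.sqrt(x) + 0.5 / math.sqrt(x) for x in grid)

# Integrated form of the bound for a few hypothetical values of K > 1.
integral_ok = all(tail_integral(K) <= closed_form_bound(K) for K in (2.0, 10.0, 1e3, 1e6))

print(pointwise_ok, integral_ok)
```

The pointwise inequality follows by squaring, since $(x^{1/2} + \frac12 x^{-1/2})^2 = x + 1 + \frac{1}{4x}$.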

3 Auxiliary lemmas 

This section contains the proofs of all the auxiliary results indicated by the symbol <C>, which 
were stated without proof during the text. 

Proof of Lemma f2.ll According to the definition of the Weyl norm, 

n 

\\f\\w = Y, E & ( 34 ) 

»=1 1.71=* 
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where, due to the distribution, the random variables 

a) 

are independent identically distributed (i.i.d.) standard normal. 

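It is easy to see that the number of terms in the sum (34) is equal to $N$, so that

$$\mathbb{P}\big( \|f\|_W^2 \ge (1 + v)N \big) = \mathbb{P}\big( (\xi_1^2 - 1) + \cdots + (\xi_N^2 - 1) \ge vN \big) = \mathbb{P}\Big( \frac{X_1 + \cdots + X_N}{N} \ge v \Big)$$

where $X_1, \ldots, X_N$ are i.i.d. random variables having the distribution of $\xi^2 - 1$, $\xi$ a standard normal random variable.

The logarithmic moment generating function of $\xi^2 - 1$ is

$$\Lambda(\lambda) = \ln \mathbb{E}\big( e^{\lambda(\xi^2 - 1)} \big) = \begin{cases} -\lambda - \frac12 \ln(1 - 2\lambda) & \text{if } \lambda < \frac12 \\ +\infty & \text{if } \lambda \ge \frac12 \end{cases}$$

and its Fenchel-Legendre transform is

$$\Lambda^*(x) = \sup_{\lambda} \big( \lambda x - \Lambda(\lambda) \big) = \begin{cases} \frac12 \big( x - \ln(x + 1) \big) & \text{if } x > -1 \\ +\infty & \text{if } x \le -1. \end{cases}$$

A basic result on large deviations [11, Ch. 2] states that, for any integer $m$ and any $x > 0$,

$$\mathbb{P}\Big( \frac{X_1 + \cdots + X_m}{m} \ge x \Big) \le \exp(-m \Lambda^*(x)).$$

This implies the statement. □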
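The closed form of the Fenchel-Legendre transform used in this proof, $\frac12(x - \ln(x+1))$, can be compared against a brute-force maximization of $\lambda x - \Lambda(\lambda)$. The sketch below is only illustrative; the function names (`cgf`, `rate`, `rate_by_grid`, `chernoff_bound`) and the grid size are assumptions. For $x > 0$ the supremum is attained at $\lambda^* = x/(2(1+x))$, which lies in $[0, \frac12)$, so a grid on that interval suffices.

```python
import math


def cgf(lam):
    """Logarithmic moment generating function of xi^2 - 1, for lam < 1/2."""
    return -lam - 0.5 * math.log(1.0 - 2.0 * lam)


def rate(x):
    """Closed-form Fenchel-Legendre transform, for x > -1."""
    return 0.5 * (x - math.log(1.0 + x))


def rate_by_grid(x, steps=200_000):
    """Brute-force supremum of lam * x - cgf(lam) over a grid on [0, 1/2)."""
    best = -math.inf
    for k in range(steps):
        lam = 0.5 * k / steps
        best = max(best, lam * x - cgf(lam))
    return best


def chernoff_bound(m, x):
    """Upper bound exp(-m * rate(x)) on P((X_1 + ... + X_m) / m >= x), for x > 0."""
    return math.exp(-m * rate(x))


for x in (0.1, 1.0, 5.0):
    lam_star = x / (2.0 * (1.0 + x))
    print(x, rate(x), lam_star * x - cgf(lam_star), rate_by_grid(x))
```

Evaluating at $\lambda^*$ reproduces the closed form directly, since $\lambda^*(1+x) = x/2$ and $1 - 2\lambda^* = 1/(1+x)$.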

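Proof of Lemma 2.2. For the first item, from the fact that $\mathbb{E}(a_j a_{j'}) = \mathbb{E}(a_j)\mathbb{E}(a_{j'}) = 0$ for $j \ne j'$ (by the independence of the $a_j$), we have

$$\mathbb{E}(f(x)f(y)) = \mathbb{E}\Big( \sum_{j,j'} a_j a_{j'} x^j y^{j'} \Big) = \sum_{j} \mathbb{E}(a_j^2)\,x^j y^j = \sum_{|j|=d} \binom{d}{j} x^j y^j = \langle x, y \rangle^d.$$

For the following items, we observe that we can differentiate under the expectation sign the function $(x, y) \mapsto \mathbb{E}(f(x)f(y)) = \langle x, y \rangle^d$, e.g.

$$\mathbb{E}(f(x)\,\partial_k f(y)) = \frac{\partial}{\partial y_k} \langle x, y \rangle^d = d\,x_k \langle x, y \rangle^{d-1},$$

$$\mathbb{E}(\partial_k f(x)\,\partial_{k'} f(y)) = \frac{\partial^2}{\partial x_k\,\partial y_{k'}} \langle x, y \rangle^d = \delta_{kk'}\,d\,\langle x, y \rangle^{d-1} + d(d-1)\,x_{k'} y_k \langle x, y \rangle^{d-2}.$$

This gives the covariances when specializing $x = y = e_0$. □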

Our next lemma deals with the analytic description of the geometry of the manifold $V$, which is used in the proof of Lemma 2.3. We define the function $\psi : B_{2n-1,\delta} \to \mathbb{R}^{n+1} \times \mathbb{R}^{n+1}$ by means of:

$$\psi(\sigma_2, \ldots, \sigma_n, \tau_2, \ldots, \tau_n, \theta) = \Big( \frac{C}{\|C\|_{n+1}}, \frac{D}{\|D\|_{n+1}} \Big)$$

where $B_{2n-1,\delta}$ is the open ball in $\mathbb{R}^{2n-1}$, centered at the origin and of radius $\delta$ sufficiently small, $\|\cdot\|_{n+1}$ is the Euclidean norm in $\mathbb{R}^{n+1}$ and the definition of $C$ and $D$ is given in several steps by the following:

• We set $\sigma_1 := (1 - \sigma_2^2 - \cdots - \sigma_n^2)^{1/2}$, $\tau_1 := (1 - \tau_2^2 - \cdots - \tau_n^2)^{1/2}$,

• $a(\sigma, \tau) := -\big( \sum_{j=2}^n \sigma_j \tau_j \big) / (\sigma_1 + \tau_1)$, $\eta(\sigma, \tau) := \sqrt{1 + a^2(\sigma, \tau)}$.

• $A := \frac{1}{\eta(\sigma,\tau)} \big( \sigma_1 e_0 + \sum_{j=2}^n \sigma_j e_j + a(\sigma, \tau) e_1 \big)$, and $B := \frac{1}{\eta(\sigma,\tau)} \big( \tau_1 e_1 + \sum_{j=2}^n \tau_j e_j + a(\sigma, \tau) e_0 \big)$.

• $C := \cos(\theta/\sqrt2) A + \sin(\theta/\sqrt2)\,\sigma_1 e_1$, and $D := \cos(\theta/\sqrt2) B - \sin(\theta/\sqrt2)\,\tau_1 e_0$.
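Lemma 3.1. [Geometry of $V$]

1. $\psi$ is a parametrization of a neighborhood of the point $(e_0, e_1)$ in the manifold $V$ with $\psi(0) = (e_0, e_1)$.

2. For $2 \le j \le n$,

$$\frac{\partial\psi}{\partial\sigma_j}(0) = (e_j, 0), \quad \frac{\partial\psi}{\partial\tau_j}(0) = (0, e_j) \quad\text{and}\quad \frac{\partial\psi}{\partial\theta}(0) = \frac{1}{\sqrt2}(e_1, -e_0).$$

Therefore the orthonormal basis $\mathcal B_T$ (defined in (12)) of the tangent space of $V$ at the point $(e_0, e_1)$ satisfies

$$\mathcal B_T = \Big( \frac{\partial\psi}{\partial\sigma_2}(0), \ldots, \frac{\partial\psi}{\partial\sigma_n}(0), \frac{\partial\psi}{\partial\tau_2}(0), \ldots, \frac{\partial\psi}{\partial\tau_n}(0), \frac{\partial\psi}{\partial\theta}(0) \Big).$$

3. The curvatures are given by:

$$\frac{\partial^2\psi}{\partial\sigma_j^2}(0) = (-e_0, 0); \quad \frac{\partial^2\psi}{\partial\tau_j^2}(0) = (0, -e_1); \quad \frac{\partial^2\psi}{\partial\sigma_j\,\partial\tau_j}(0) = -\frac12 (e_1, e_0) \quad \text{for } 2 \le j \le n,$$

$$\frac{\partial^2\psi}{\partial\sigma_j\,\partial\sigma_k}(0) = \frac{\partial^2\psi}{\partial\tau_j\,\partial\tau_k}(0) = \frac{\partial^2\psi}{\partial\sigma_j\,\partial\tau_k}(0) = (0, 0) \quad \text{for } 2 \le j \ne k \le n,$$

$$\frac{\partial^2\psi}{\partial\sigma_j\,\partial\theta}(0) = \frac{\partial^2\psi}{\partial\tau_j\,\partial\theta}(0) = (0, 0) \quad \text{for } 2 \le j \le n.$$

Proof. If $\delta$ is small enough, $\psi$ is well defined and is $C^\infty$. It is easy to check that $\langle C, D \rangle_{\mathbb{R}^{n+1}} = 0$, so that $\psi(\sigma_2, \ldots, \sigma_n, \tau_2, \ldots, \tau_n, \theta) \in V$.

A routine calculation of first derivatives allows one to check item 2 and also implies that if $\delta$ is small enough, $\psi$ is a diffeomorphism from $B(0, \delta)$ onto its image. The computation of second order derivatives is also immediate. □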
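The properties of $\psi$ used in this proof can be illustrated numerically. The sketch below implements the four steps of the definition under the assumption that vectors of $\mathbb{R}^{n+1}$ are stored with coordinates ordered as $(e_0, e_1, \ldots, e_n)$. The function names `psi` and `dot` and the sample parameter values are hypothetical. It checks $\psi(0) = (e_0, e_1)$, the orthogonality of the two components, and the derivative in $\theta$ at the origin by a central finite difference.

```python
import math


def psi(sigma, tau, theta):
    """sigma = (sigma_2..sigma_n), tau = (tau_2..tau_n); returns (C/|C|, D/|D|)."""
    s1 = math.sqrt(1.0 - sum(s * s for s in sigma))
    t1 = math.sqrt(1.0 - sum(t * t for t in tau))
    a = -sum(s * t for s, t in zip(sigma, tau)) / (s1 + t1)
    eta = math.sqrt(1.0 + a * a)
    # A = (sigma_1 e_0 + a e_1 + sum sigma_j e_j) / eta, and similarly for B.
    A = [s1 / eta, a / eta] + [s / eta for s in sigma]
    B = [a / eta, t1 / eta] + [t / eta for t in tau]
    cos_t = math.cos(theta / math.sqrt(2.0))
    sin_t = math.sin(theta / math.sqrt(2.0))
    C = [cos_t * x for x in A]
    C[1] += sin_t * s1  # + sin(theta / sqrt 2) * sigma_1 * e_1
    D = [cos_t * x for x in B]
    D[0] -= sin_t * t1  # - sin(theta / sqrt 2) * tau_1 * e_0
    norm_c = math.sqrt(sum(x * x for x in C))
    norm_d = math.sqrt(sum(x * x for x in D))
    return [x / norm_c for x in C], [x / norm_d for x in D]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


# Base point and a nearby point (hypothetical parameters, n = 3).
x0, y0 = psi([0.0, 0.0], [0.0, 0.0], 0.0)
x, y = psi([0.05, -0.02], [0.03, 0.04], 0.1)

# Central difference in theta at the origin.
h = 1e-6
xp, yp = psi([0.0, 0.0], [0.0, 0.0], h)
xm, ym = psi([0.0, 0.0], [0.0, 0.0], -h)
dx = [(p - m) / (2.0 * h) for p, m in zip(xp, xm)]
dy = [(p - m) / (2.0 * h) for p, m in zip(yp, ym)]

print(x0, y0, abs(dot(x, y)) < 1e-12)
```

The finite difference should approximate $\frac{1}{\sqrt2}(e_1, -e_0)$, in agreement with item 2 of the lemma.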

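Corollary 3.2. Let us set $L' := L'(e_0, e_1)$ and $L'' := L''(e_0, e_1)$ for the free first order and second order derivatives of $L$ at $(e_0, e_1)$. We use the parametrization introduced in the previous lemma. Consider the function

$$\tilde L(\sigma_2, \ldots, \sigma_n, \tau_2, \ldots, \tau_n, \theta) = L(\psi(\sigma_2, \ldots, \sigma_n, \tau_2, \ldots, \tau_n, \theta)).$$

Let $M$ be the symmetric matrix of the linear operator $\tilde L''(0)$ in the canonical basis of $\mathbb{R}^{2n-1}$:

$$M = \begin{pmatrix} M_{\sigma\sigma} & M_{\sigma\tau} & M_{\sigma\theta} \\ M_{\tau\sigma} & M_{\tau\tau} & M_{\tau\theta} \\ M_{\theta\sigma} & M_{\theta\tau} & M_{\theta\theta} \end{pmatrix} \in \mathbb{R}^{(2n-1) \times (2n-1)}$$

where for $2 \le j, k \le n$,

$$(M_{\sigma\sigma})_{jk} = \begin{cases} \langle L''(e_j, 0), (e_j, 0) \rangle - \langle L', (e_0, 0) \rangle & \text{for } j = k \\ \langle L''(e_j, 0), (e_k, 0) \rangle & \text{for } j \ne k \end{cases} = \begin{cases} \dfrac{\partial^2 L}{\partial x_j^2} - \dfrac{\partial L}{\partial x_0} & \text{for } j = k \\[2mm] \dfrac{\partial^2 L}{\partial x_j\,\partial x_k} & \text{for } j \ne k, \end{cases}$$

$$(M_{\sigma\tau})_{jk} = (M_{\tau\sigma})_{kj} = \frac{\partial^2 (L \circ \psi)}{\partial\sigma_j\,\partial\tau_k}(0) = \Big\langle L'' \frac{\partial\psi}{\partial\sigma_j}(0), \frac{\partial\psi}{\partial\tau_k}(0) \Big\rangle + \Big\langle L', \frac{\partial^2\psi}{\partial\sigma_j\,\partial\tau_k}(0) \Big\rangle$$

$$= \begin{cases} \langle L''(e_j, 0), (0, e_j) \rangle - \frac12 \langle L', (e_1, e_0) \rangle & \text{for } j = k \\ \langle L''(e_j, 0), (0, e_k) \rangle & \text{for } j \ne k \end{cases} = \begin{cases} \dfrac{\partial^2 L}{\partial x_j\,\partial y_j} - \dfrac12 \Big( \dfrac{\partial L}{\partial x_1} + \dfrac{\partial L}{\partial y_0} \Big) & \text{for } j = k \\[2mm] \dfrac{\partial^2 L}{\partial x_j\,\partial y_k} & \text{for } j \ne k, \end{cases}$$

$$(M_{\tau\tau})_{jk} = \begin{cases} \langle L''(0, e_j), (0, e_j) \rangle - \langle L', (0, e_1) \rangle & \text{for } j = k \\ \langle L''(0, e_j), (0, e_k) \rangle & \text{for } j \ne k \end{cases} = \begin{cases} \dfrac{\partial^2 L}{\partial y_j^2} - \dfrac{\partial L}{\partial y_1} & \text{for } j = k \\[2mm] \dfrac{\partial^2 L}{\partial y_j\,\partial y_k} & \text{for } j \ne k, \end{cases}$$

for $2 \le j \le n$,

$$(M_{\sigma\theta})_j = \frac{1}{\sqrt2} \langle L''(e_j, 0), (e_1, -e_0) \rangle = \frac{1}{\sqrt2} \Big( \frac{\partial^2 L}{\partial x_j\,\partial x_1} - \frac{\partial^2 L}{\partial x_j\,\partial y_0} \Big),$$

$$(M_{\tau\theta})_j = \frac{1}{\sqrt2} \langle L''(0, e_j), (e_1, -e_0) \rangle = \frac{1}{\sqrt2} \Big( \frac{\partial^2 L}{\partial y_j\,\partial x_1} - \frac{\partial^2 L}{\partial y_j\,\partial y_0} \Big),$$

and finally

$$\begin{aligned}
M_{\theta\theta} = \frac{\partial^2 (L \circ \psi)}{\partial\theta^2}(0) &= \Big\langle L'' \frac{\partial\psi}{\partial\theta}(0), \frac{\partial\psi}{\partial\theta}(0) \Big\rangle + \Big\langle L', \frac{\partial^2\psi}{\partial\theta^2}(0) \Big\rangle \\
&= \frac12 \big( \langle L''(e_1, -e_0), (e_1, -e_0) \rangle - \langle L', (e_0, e_1) \rangle \big) \\
&= \frac12 \Big( \frac{\partial^2 L}{\partial x_1^2} - 2\frac{\partial^2 L}{\partial x_1\,\partial y_0} + \frac{\partial^2 L}{\partial y_0^2} - \frac{\partial L}{\partial x_0} - \frac{\partial L}{\partial y_1} \Big).
\end{aligned}$$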

Proof of Lemma 2.4. We have:

$$\mathbb{E}(g(X) \mid XY + Z = u,\ Y = 0) = \int_{\mathbb{R}^{p \times q}} g(x)\,\frac{p_{X,Y,XY+Z}(x, 0, u)}{p_{Y,XY+Z}(0, u)}\,dx \tag{35}$$

since

$$\frac{p_{X,Y,XY+Z}(x, 0, u)}{p_{Y,XY+Z}(0, u)}$$

is the conditional density of $X$ at the point $x$, given that $Y = 0$, $XY + Z = u$.

Now, the density $p_{X,Y,XY+Z}(x, y, u)$ is easily computed from the change of variables formula (using the independence of $X, Y, Z$), obtaining:

$$p_{X,Y,XY+Z}(x, y, u) = p_X(x)\,p_Y(y)\,p_Z(u - xy).$$

This also implies

$$p_{Y,XY+Z}(0, u) = \int_{\mathbb{R}^{p \times q}} p_{X,Y,XY+Z}(x, 0, u)\,dx = p_Y(0)\,p_Z(u).$$

Replacing $p_{Y,XY+Z}(0, u)$ by $p_Y(0)\,p_Z(u)$ in (35), we get:

$$\mathbb{E}(g(X) \mid XY + Z = u,\ Y = 0) = \int_{\mathbb{R}^{p \times q}} g(x)\,p_X(x)\,dx = \mathbb{E}(g(X)).$$

□

Proof of Lemma 2.5. Write the Taylor expansion of $\det(C_q(\lambda))$ at $\lambda = 0$ and compute the successive derivatives at this point. □



Proof of Lemma 2.6. We note that $Q = MM^t$ where

$$M := \begin{pmatrix} A & B \\ C & 0 \end{pmatrix},$$

with row blocks of sizes $k$ and $n-1$ and column blocks of sizes $n$ and $n$. Applying the Cauchy-Binet formula (29), we get

$$\det(Q) = \sum_{\#(S')=k+n-1} \big(\det(M^{S'})\big)^2, \tag{36}$$

where the sum is over all choices of $k + n - 1$ columns of $M$.

We fix such an $S'$. It is easy to see, performing a Laplace expansion with respect to the first $k$ rows of the obtained matrix, that if we take strictly more than $k$ columns in the $n$-columns right block corresponding to $B$, then $\det(M^{S'}) = 0$. This is because in this expansion there will always remain a zero column. Therefore, we can only choose up to $k$ columns in the right block, i.e. there are two cases: we choose all the $n$ columns in the left block and $k-1$ columns in the right block, or we choose $n-1$ columns in the left block and $k$ columns in the right block.

Case 1: $M^{S'}$ is of the form

$$M^{S'} = \begin{pmatrix} A & B^S \\ C & 0 \end{pmatrix} \in \mathbb{R}^{(k+n-1) \times (k+n-1)},$$

with column blocks of sizes $n$ and $k-1$. Here $S$ is the set of $k-1$ columns of $B$ that we kept. Again using Laplace expansion with respect to the last $k-1$ columns of $M^{S'}$, we see that each non-zero determinant corresponds to suppressing a row, say row $i$, of $B^S$, times the determinant of its complementary matrix, which is equal to the $i$-th row of $A$ added to $C$. Finally, expanding this last matrix by the $i$-th row of $A$, we obtain:

$$\det(M^{S'}) = (-1)^{n(k-1)} \sum_{i=1}^{k} (-1)^{k-i} \det\big(B_{\bar i}^{S}\big) \sum_{j=1}^{n} (-1)^{j-1} a_{ij} \det\big(C^{\bar j}\big),$$

where $\bar i$ and $\bar j$ denote the complementary rows or columns, accordingly.

Case 2: $M^{S'}$ is of the following form for some $j$, which corresponds to the suppressed column of $A$, and $S$ is a choice of $k$ columns of $B$:

$$M^{S'} = \begin{pmatrix} A^{\bar j} & B^S \\ C^{\bar j} & 0 \end{pmatrix} \in \mathbb{R}^{(k+n-1) \times (k+n-1)},$$

with column blocks of sizes $n-1$ and $k$. Then, permuting the two blocks of rows and since the obtained matrix is block-triangular, we get $\det(M^{S'}) = (-1)^{k(n-1)} \det(C^{\bar j}) \det(B^S)$.

Therefore, the sum in (36) for all $S'$ in Case 2 gives:

$$\sum_{j=1}^{n} \big(\det(C^{\bar j})\big)^2 \sum_{\#(S)=k} \big(\det(B^S)\big)^2 = \det(CC^t)\det(BB^t),$$

again by the Cauchy-Binet formula (29). The statement follows from adding up over all $S'$ in Cases 1 and 2. □



Proof of Lemma 2.7. The proof is based on the following bound for the tails of the probability distribution of $\lambda$. For $t > 0$ one has (see for example [10] and references therein):
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"(A > 2 + V2 t) < exp ( - — ) 



Therefore, since £ < n. 



E(A ) = / P(A >x)dx = / ¥(X > y) £y e - 1 dy 
Jo Jo 

v. r , fi + ^-v , «p(-^)*. 



/n m 2 



«■ J^ v V2ri' 

/"2 /"+ 00 /7T 

< ^ + £ \ — 2 £_1 / exp (wW ) du since 1 + a; < exp(x) 

V n J \/^ l V 2 2 

= 4 £ + ^/^2 £ - 1 cxp(n/4) / +0 ° cxp(-^)dy 
V n J^fi 2 

S 4« + /^2»-p(2)^ B p(-2) SS .4' 



where in the last line we used that 



X ^{-^)dy<j a -exp(- T jd y =-exp(- y j. D 
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